Posts

We might be missing some key feature of AI takeoff; it'll probably seem like "we could've seen this coming" 2024-05-09T15:43:11.490Z
AI alignment researchers may have a comparative advantage in reducing s-risks 2023-02-15T13:01:50.799Z
Moral Anti-Realism: Introduction & Summary 2022-04-02T14:29:01.751Z
Moral Anti-Epistemology 2015-04-24T03:30:27.972Z
Arguments Against Speciesism 2013-07-28T18:24:58.354Z

Comments

Comment by Lukas_Gloor on The Value Proposition of Romantic Relationships · 2025-06-03T18:33:40.980Z · LW · GW

Thanks, that's helpful context! Yeah, it's worth flagging that I have not read Duncan's post beyond the list.

Duncan's post suggests that different people in the same social context can view exercises from this list either as potentially humiliating comfort-zone-pushing challenges, or as a silly-playful-natural thing to do.

Seems like my reaction proved this part right, at least. I knew some people must find something about it fun, but my model was more like "Some people think comfort/trust zone expansion itself is fun" rather than "Some people with already-wide comfort/trust zones find it fun to do things that other people would only do under the banner of comfort/trust zone expansion." 

(Sometimes the truth can be somewhere in the middle, though? I would imagine that the people who would quite like to do most of the things in the list find it appealing that it's about stuff you "don't normally do," that it's "pushing the envelope" a little?)

That said, I don't feel understood by the (fear of) humiliation theme in your summary of Duncan's post. Sure, that's a thing and I have that as well, but the even bigger reason why I wouldn't be comfortable going through a list of "actions to do in the context of a game that's supposed to be fun" is that the entire concept just doesn't do anything for me? It just seems pointless at best, plus there's discomfort from the artificiality of it?

As I also wrote in my reply to John:

It's hard to pinpoint why exactly I think many people are highly turned off by this stuff, but I'm pretty sure (based on introspection) that it's not just fear of humiliation or not trusting other people in the room. There's something off-putting to me about the performativeness of it. Something like "If the only reason I'm doing it is because I'm following instructions, not because at least one of us actually likes it and the other person happily consents to it, it feels really weird."

(This actually feels somewhat related to why I don't like small talk -- but that probably can't be the full explanation because my model of most rationalists is that they probably don't like small talk.)

Comment by Lukas_Gloor on The Value Proposition of Romantic Relationships · 2025-06-03T18:23:05.363Z · LW · GW

I was initially surprised that you think I was generalizing too far -- because that's what I criticized about your quoting of Duncan's list, and in my head I was just pointing to myself as an obviously valid counterexample (because I'm a person who exists, and fwiw many but not all of my friends are similar), not claiming that all other people would be similarly turned off.

But seeing Thane's reply, I think it's fair to say that I was generalizing too far by using the framing of "comfort zone expansion" for things that some people might legitimately find fun.

As I also write in my reply to Thane, I knew some people must find something about things like the ASMR example fun, but my model was more like "Some people think comfort/trust zone expansion itself is fun" rather than "Some people with already-wide comfort/trust zones find it fun to do things that other people would only do under the banner of comfort/trust zone expansion." Point taken!

Still, I feel like the list could be more representative of humanity in general by not using so many examples that only appeal to people who like things like circling, awkward social games, etc.

It's hard to pinpoint why exactly I think many people are highly turned off by this stuff, but I'm pretty sure (based on introspection) that it's not just fear of humiliation or not trusting other people in the room. There's something off-putting to me about the performativeness of it. Something like "If the only reason I'm doing it is because I'm following instructions, not because at least one of us actually likes it and the other person happily consents to it, it feels really weird." 

(This actually feels somewhat related to why I don't like small talk -- but that probably can't be the full explanation because my model of most rationalists is that they probably don't like small talk.) 

Comment by Lukas_Gloor on The Value Proposition of Romantic Relationships · 2025-06-03T15:34:16.415Z · LW · GW

As this post was coming together, Duncan fortuitously dropped a List of Truths and Dares which is pretty directly designed around willingness to be vulnerable, in exactly the sense we’re interested in here. Here is his list; consider it a definition-by-examples of willingness to be vulnerable: 


I'm pretty sure you're missing something (edit: or rather, you got the thing right but have added some other thing that doesn't belong) because the list in question is about more than just willingness to be vulnerable in the sense that gives value to relationships. (A few examples from the list are fine as definition-by-examples for that purpose, but more than 50% of the examples are about something entirely different.) Most of the examples in the list are about comfort zone expansion. Vulnerability in relationships is natural/authentic (which doesn't mean it has to feel easy), while comfort zone expansion exercises are artificial/stilted.

You might reply that the truth-and-dare context of the list means that obviously everything is going to seem a bit artificial, but the thing you were trying to point at is just "vulnerability is about being comfortable being weird with each other." But that defense fails because being comfortable is literally the opposite of pushing your comfort zone.

For illustration, if my wife and I put our faces together and we make silly affectionate noises because somehow we started doing this and we like it and it became a thing we do, that's us being comfortable and a natural expression of playfulness. By contrast, if I were to give people who don't normally feel like doing this the instruction to put their faces together and make silly affectionate noises, probably the last thing they will be is comfortable!

[Edited to add:] From the list, the best examples are the ones that get people to talk about topics they wouldn't normally talk about, because the goal is to say true things that are for some reason difficult to say, which is authentic. By contrast, instructing others to perform actions they wouldn't normally feel like performing (or wouldn't feel like performing in this artificial sort of setting) is not about authenticity.

I'm not saying there's no use to expanding one's comfort zone. Personally, I'd rather spend a day in solitary confinement than whisper in a friend's ear for a minute ASMR-style, but that doesn't mean that my way of being is normatively correct -- I know intellectually that the inner terror of social inhibitions or the intense disdain for performative/fake-feeling social stuff isn't to my advantage in every situation. Still, in the same way, those who've made it a big part of their identity to continuously expand their comfort zones (or maybe see value in helping others come out of their shell) should also keep in mind that not everyone values that sort of thing or needs it in their lives.

Comment by Lukas_Gloor on MichaelDickens's Shortform · 2025-04-24T13:35:52.687Z · LW · GW

At a moderate P(doom), say under 25%, from a selfish perspective it makes sense to accelerate AI if it increases the chance that you get to live forever, even if it increases your risk of dying.

If you're not elderly or otherwise at risk of irreversible harms in the near future, then pausing for a decade (say) to reduce the chance of AI ruin by even just a few percentage points still seems good. So the crux is still "can we do better by pausing." (This assumes pauses on the order of 2-20 years; the argument changes for longer pauses.)
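
To make the selfish tradeoff concrete, here's a rough back-of-the-envelope sketch. The numbers (baseline mortality, size of the risk reduction) are illustrative assumptions I'm picking for the example, not claims I'm confident in:

```python
# Illustrative assumptions, not data: ~0.2%/year chance of dying from ordinary causes
# for a healthy non-elderly adult, and a pause that buys a 3-percentage-point reduction in AI ruin.
p_die_during_pause = 1 - (1 - 0.002) ** 10   # chance of not surviving a 10-year pause, ~2%
doom_reduction = 0.03                        # assumed benefit of pausing
print(round(p_die_during_pause, 3), doom_reduction)  # 0.02 0.03
# Even on purely selfish terms, a ~2% cost for a ~3pp reduction in ruin looks worth taking,
# which is the sense in which pausing "still seems good" here.
```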

Maybe people think the background level of xrisk is higher than it used to be over the last decades because the world situation seems to be deteriorating. But IMO this also increases the selfishness aspect of pushing AI forward because if you're that desperate for a deus ex machina, surely you also have to think that there's a good chance things will get worse when you push technology forward.

(Lastly, I also want to note that for people who care less about living forever and care more about near-term achievable goals like "enjoy life with loved ones," the selfish thing would be to delay AI indefinitely because rolling the dice for a longer future is then less obviously worth it.)

Comment by Lukas_Gloor on OpenAI lost $5 billion in 2024 (and its losses are increasing) · 2025-03-31T11:57:18.748Z · LW · GW

Well done finding the direct contradiction. (I also thought the claims seemed fishy but didn't think of checking whether model running costs are bigger than revenue from subscriptions.)

Two other themes in the article seem to be in a bit of tension with each other:

  • Models have little potential/don't provide much value.
  • People use their subscriptions so much that the company loses money on its subscriptions.

It feels like if people max out use on their subscriptions, then the models are providing some kind of value (which makes it promising to keep working on them, even if just to make inference cheaper). By contrast, if people don't use them much, you should at least be able to make a profit on existing subscriptions (even if you might be worried about user retention and growth rates).
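
To illustrate with made-up numbers (I don't know OpenAI's actual prices or inference costs, so treat these purely as placeholders):

```python
# Toy numbers for illustration only; not OpenAI's real prices or inference costs.
subscription_price = 20.0   # $/month
heavy_user_cost = 30.0      # $/month of inference -> the company loses money, but the usage implies real value
light_user_cost = 3.0       # $/month of inference -> the subscription is profitable as-is
print(subscription_price - heavy_user_cost)   # -10.0: loss per heavy user, but they're clearly getting use out of it
print(subscription_price - light_user_cost)   # 17.0: profit per light user, even if growth/retention is a worry
```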

All of that said, I also get the impression "OpenAI is struggling." I just think it has more to do with their specific situation rather than with the industry (plus I'm not as confident in this take as the author seems to be). 

Comment by Lukas_Gloor on AI #108: Straight Line on a Graph · 2025-03-20T16:12:16.943Z · LW · GW

Rob Bensinger: If you’re an AI developer who’s fine with AI wiping out humanity, the thing that should terrify you is AI wiping out AI.

The wrong starting seed for the future can permanently lock in AIs that fill the universe with non-sentient matter, pain, or stagnant repetition.

For those interested in this angle (how AI outcomes without humans could still go a number of ways, and what variables could make them go better/worse), I recently brainstormed here and here some things that might matter.

Comment by Lukas_Gloor on Going Nova · 2025-03-19T14:36:00.516Z · LW · GW

Parts of how that story was written trigger my sense of "this might have been embellished." (It reminds me of viral reddit stories.)

I'm curious if there are other accounts where a Nova persona got a user to contact a friend or family member with the intent of getting them to advocate for the AI persona in some way. 

Comment by Lukas_Gloor on AI #107: The Misplaced Hype Machine · 2025-03-18T21:12:14.209Z · LW · GW

"The best possible life" for me pretty much includes "everyone who I care about is totally happy"?

Okay, I can see it being meant that way. (Even though, if you take this logic further, you could, as an altruist, make it include everything going well for everyone everywhere.) Still, that's only 50% of the coinflip.

And parents certainly do dangerous risky things to provide better future for their children all the time.

Yeah, that's true. I could even imagine that parents are more likely to flip coins that say "you die for sure but your kids get a 50% chance of the perfect life." (Especially if the kids are at an age where they would be able to take care of themselves even under the bad outcome.) 

Comment by Lukas_Gloor on AI #107: The Misplaced Hype Machine · 2025-03-18T18:45:33.598Z · LW · GW

Are you kidding me? What is your discount rate? Not flipping that coin is absurd.

Not absurd. Not everything is "maximize your utility." Some people care about the trajectory they're on together with other people. Are parents supposed to just leave their children? Do married people get to flip a coin that decides for both of them, or do they have to make independent throws (or does only one person get the opportunity)?

Also, there may be further confounders so that the question may not tell you exactly what you think it tells you. For instance, some people will flip the coin because they're unhappy and the coin is an easy way to solve their problems one way or another -- suicide feels easier if someone else safely does it for you and if there's a chance of something good to look forward to.

Comment by Lukas_Gloor on Sentinel's Global Risks Weekly Roundup #11/2025. Trump invokes Alien Enemies Act, Chinese invasion barges deployed in exercise. · 2025-03-18T12:23:16.265Z · LW · GW

Thanks for this newsletter, I appreciate the density of information! 

Comment by Lukas_Gloor on Elon Musk May Be Transitioning to Bipolar Type I · 2025-03-13T16:24:52.988Z · LW · GW

I thought about this and I'm not sure Musk's changes in "unhingedness" require more explanation than "power and fame have the potential to corrupt and distort your reasoning, making you more overconfident." The result looks a bit like hypomania, but I've seen this before with people who got fame and power injections. While Musk was already super accomplished (and justifiably so) before taking over Twitter and jumping into politics, being the Twitter owner (so he can activate algorithmic godmode and get even more attention) probably boosted both his actual fame and his perceived fame by a lot, and being close buddies with the President certainly gave him more power too. Maybe this was too much -- you probably have to be unusually grounded and principled to not go a bit off the rails if you're in that sort of position. (Or maybe that means you shouldn't want to maneuver yourself into quite that much power in the first place.)

Comment by Lukas_Gloor on Anthropic, and taking "technical philosophy" more seriously · 2025-03-13T13:29:43.158Z · LW · GW

It feels vaguely reasonable to me to have a belief as low as 15% on "Superalignment is Real Hard in a way that requires like a 10-30 year pause." And, at 15%, it still feels pretty crazy to be oriented around racing the way Anthropic is. 

Yeah, I think the only way I maybe find the belief combination "15% that alignment is Real Hard" and "racing makes sense at this moment" compelling is if someone thinks that pausing now would be too late and inefficient anyway. (Even then, it's worth considering the risks of "What if the US aided by AIs during takeoff goes much more authoritarian to the point where there'd be little difference between that and the CCP?") Like, say you think takeoff is just a couple of years of algorithmic tinkering away and compute restrictions (which are easier to enforce than prohibitions against algorithmic tinkering) wouldn't even make that much of a difference now. 

However, if pausing now is too late, we should have paused earlier, right? So, insofar as some people today justify racing via "it's too late for a pause now," where were they earlier?

Separately, I want to flag that my own best guess on alignment difficulty is somewhere in between your "Real Hard" and my model of Anthropic's position. I'd say I'm overall closer to you here, but I find the "10-30y" thing a bit too extreme. I think that's almost like saying, "For practical purposes, we non-uploaded humans should think of the deep learning paradigm as inherently unalignable." I wouldn't confidently put that below 15% (we simply don't understand the technology well enough), but I likewise don't see why we should be confident in such hardness, given that ML at least gives us better control of the new species' psychology than, say, animal taming and breeding (e.g., Carl Shulman's arguments somewhere -- iirc -- in his podcasts with Dwarkesh Patel). Anyway, the thing that I instead think of as the "alignment is hard" objection to the alignment plans I've seen described by AI companies, is mostly just a sentiment of, "no way you can wing this in 10 hectic months while the world around you goes crazy." Maybe we should call this position "alignment can't be winged." (For the specific arguments, see posts by John Wentworth, such as this one and this one [particularly the section, "The Median Doom-Path: Slop, Not Scheming"].)

The way I could become convinced otherwise is if the position is more like, "We've got the plan. We think we've solved the conceptually hard bits of the alignment problem. Now it's just a matter of doing enough experiments where we already know the contours of the experimental setups. Frontier ML coding AIs will help us with that stuff and it's just a matter of doing enough red teaming, etc." 

However, note that even when proponents of this approach describe it themselves, it sounds more like "we'll let AIs do most of it ((including the conceptually hard bits?))" which to me just sounds like they plan on winging it. 

Comment by Lukas_Gloor on Elon Musk May Be Transitioning to Bipolar Type I · 2025-03-12T01:33:31.121Z · LW · GW

The DSM-5 may draw a bright line between them (mainly for insurance reimbursement and treatment protocol purposes), but neurochemically, the transition is gradual.

That sounded mildly surprising to me (though in hindsight I'm not sure why it did) so I checked with Claude 3.7, and it said something similar in reply to me trying to ask a not-too-leading question. (Though it didn't talk about neurochemistry -- just that behaviorally the transition or distinction can often be gradual.) 

Comment by Lukas_Gloor on Childhood and Education #9: School is Hell · 2025-03-08T21:14:18.259Z · LW · GW

In my comments thus far, I've been almost exclusively focused on preventing severe abuse and too much isolation.

Something else I'm unsure about, but not necessarily a hill I want to die on given that government resources aren't unlimited, is the question of whether kids should have a right to "something at least similarly good as voluntary public school education." I'm not sure if this can be done cost-effectively, but if the state had a lot of money that it's not otherwise using in better ways, then I think it would be pretty good to have standardized tests for homeschooled kids every now and then, maybe every two to three years. One of them could be an IQ test, the other an abilities test. If the kid has an IQ that suggests that they could learn things well but they seem super behind other children of their age, and you ask them if they want to learn and they say yes with enthusiasm, then that's suggestive of the parents doing an inadequate job, in which case you could put the parents on homeschooling probation and/or force them to allow their child to go to public school?

Comment by Lukas_Gloor on Childhood and Education #9: School is Hell · 2025-03-08T21:02:13.226Z · LW · GW

More concretely, do you think parents should have to pass a criminal background check (assuming this is what you meant by "background check") in order to homeschool, even if they retain custody of their children otherwise?

I don't really understand why you're asking me about this more intrusive and less-obviously-cost-effective intervention, when one of the examples I spelled out above was a lower-effort, less intrusive, less controversial version of this sort of proposal.

I wrote above: 

Like, even if yearly check-ins for everyone turn out to be too expensive, you could at least check if people who sign their kid up for homeschooling already have a history of neglect and abuse, so that you can add regular monitoring if that turns out to be the case. (Note that such background checks are a low-effort action where the article claims no state is doing it so far.)

(In case this wasn't clear, by "regular monitoring" I mean stuff like "have a social worker talk to the kids.") 

To make this more vivid, if someone is, e.g., a step dad with a history of child sexual abuse, or there's been a previous substantiated complaint about child neglect or physical abuse in some household, then yeah, it's probably a bad idea if parents with such track records can pull children out of public schools and thereby avoid all outside accountability for the next decade or so, possibly putting their children in a situation where no one would notice if they deteriorated/showed worsening signs of severe abuse. Sure, you're right that the question of custody plays into that. You probably agree that there are some cases where custody should be taken away. With children in school, there's quite a bit of "noticing surface" -- opportunities for people to notice and check in if something seems really off. With children in completely unregulated homeschooling environments, that noticing surface can shrink all the way down to zero (like, maybe the evil grandma locked the children into a dark basement for the last two years and no one outside the household would know). All I'm saying is: the households that opt for potentially high isolation should get compensatory check-ins.

I even flagged that it may be too much effort to hire enough social workers to visit all the relevant households, so I proposed the option that maybe no one needs to check in yearly if Kelsey Piper and her friends are jointly homeschooling their kids, and instead, monitoring resources could get concentrated on cases where there's a higher prior of severe abuse and neglect. 

Again, I don't see how that isn't reasonable.

Habryka claims I display a missing mood of not understanding how costly marginal regulation can be. In turn, I for sure feel like the people I've been arguing with display something weird. I wouldn't call it a missing mood, but more like a missing ambition to make things as good as they can be, think in nuances, and not demonize (and write off without closer inspection) all possible regulation just because it's common for regulation to go too far?

Comment by Lukas_Gloor on Childhood and Education #9: School is Hell · 2025-03-08T18:59:10.527Z · LW · GW

Thanks for elaborating, that's helpful. 

If we were under a different education regime

Something like what you describe would maybe even be my ideal too (I'm hedging because I don't have super informed views on this). But I don't understand how my position of "let's make sure we don't miss out on low-cost, low-downside ways of safeguarding children (who btw are people too and didn't consent to be born, especially not in cases where their parents lack empathy or treat children as not people) from severe abuse" is committed to having to answer this hypothetical. I feel like all my position needs to argue for is that some children have parents/caretakers where it would be worse if those caretakers had 100% control and no accountability than if the children also spend some time outside the household in public school. This can hold true even if we grant that mandatory public school is itself abusive to children who don't want to be there.

Comment by Lukas_Gloor on Childhood and Education #9: School is Hell · 2025-03-08T15:12:23.625Z · LW · GW

Seriously, -5/-11? 

I went through my post line by line and I don't get what people are allergic to.

I'm not taking sides. I flagged that some of the criticisms of homeschooling appear reasonable and important to me. I'm pretty sure I'm right about this, but somehow people want me to say less of this sort of thing, because what? Because public schools are "hell"? How is that different from people who consider the other political party so bad that you cannot say one nuanced thing about them -- isn't that looked down upon on this site?

Also, speaking of "hell," I want to point out that not all types of abuse are equal and that the most extreme cases of childhood suffering probably happen disproportionately in the most isolated of homes.

How can it not be an ideal to aim for a situation where all children have contact with some qualified person outside their household who can check that they're not being badly abused? Admittedly, it's not understood to be a public school teacher's primary role to notice when something with a child is seriously wrong, but it's a role that they often end up filling (and I wouldn't be surprised if they even get trained in this in many areas). You don't necessarily need public schools for this sort of checking in that serves as a safeguard against some of the most severe kinds of prolonged abuse, but if you just replace public schooling with homeschooling, that role falls away. So, what you could do to get some of the monitoring back: have homeschooling with (e.g.) yearly check-ins with the affected children from a social worker. I don't know the details, but my guess is that some states have this and others don't. (As the "hit piece" claims, regulation differs from state to state, and some states are poorly regulated.)

I'm not saying I know for sure whether yearly check-ins would be cost-effective compared to other things the state puts money into, but they totally might be, and I doubt that the people who are trying to discourage this discussion (with downvotes and fully-general counterarguments that clearly prove way too much) know enough about this either to be certain that there are no easy/worthwhile ways to make the situation safer. Like, even if yearly check-ins for everyone turn out to be too expensive, you could at least check whether people who sign their kid up for homeschooling already have a history of neglect and abuse, so that you can add regular monitoring if that turns out to be the case. (Note that such background checks are a low-effort action where the article claims no state is doing it so far.)

Comment by Lukas_Gloor on So how well is Claude playing Pokémon? · 2025-03-07T22:50:41.384Z · LW · GW

I got the impression that using only an external memory like in the movie Memento (and otherwise immediately forgetting everything that wasn't explicitly written down) was the biggest hurdle to faster progress. I think it does kind of okay considering that huge limitation. Visually, it would also benefit from learning the difference between what is or isn't a gate/door, though. 

Comment by Lukas_Gloor on Childhood and Education #9: School is Hell · 2025-03-07T19:54:53.391Z · LW · GW

It depends on the efficiency of the interventions you'd come up with (some may not be much of a "burden" at all) and on the elasticity with which parents who intend to homeschool are turned away by "burdens". You make a good point, but what you say is not generally true -- it totally depends on the specifics of the situation. (Besides, didn't the cited study say that both rates of abuse were roughly equal? I don't think anyone suggested that public schooled kids have [edit: drastically] higher abuse rates than home schooled ones? Was it 37% vs 36%?)

Comment by Lukas_Gloor on Childhood and Education #9: School is Hell · 2025-03-07T15:59:47.895Z · LW · GW

I feel like it's worth pointing out the ways homeschooling can go badly wrong. Whether or not there's a correlation between homeschooling and abuse, it's obvious that homeschooling can cover up particularly bad instances of abuse (even if it's not the only way to do that). So, I feel like a position of "homeschooling has the potential to go very bad; we should probably have good monitoring to prevent that; are we sure we're doing that? Can we check?" seems sensible.

The article you call a "hit piece" makes some pretty sensible points. The title isn't something like "The horrors of homeschooling," but rather, "Children Deserve Uniform Standards in Homeschooling." 

Here are some quotes that support this reasonable angle: 

Some children may not be receiving any instruction at all. Most states don’t require home­schooled kids to be assessed on specific topics the way their classroom-based peers are. This practice enables educational neglect that can have long-lasting consequences for a child’s development.

[...]

Not one state checks with Child Protective Services to determine whether the parents of children being home­schooled have a history of abuse or neglect.

[...]

But federal mandates for reporting and assessment to protect children don’t need to be onerous. For example, home­school parents could be required to pass an initial background check, as every state requires for all K–12 teachers.

Those all seem like important points to me regardless of whether the article is right about the statistics you and Eric Hoel criticize. 

(As an aside, I don't even get why Eric Hoel is convinced that the article wanted to claim that homeschooling is correlated with abuse. To me, these passages, including the 36% figure, are not about claiming that homeschooling leads to more abuse than regular schooling. Instead, I interpret them as saying, "There's too much abuse happening in the homeschooling context, so it would be good to have better controls." The article even mentions that homeschooling may be the best choice for some children.)

The variance in homeschooling is clearly huge. The affluent rationalists who coordinate with each other to homeschool their kids are a totally different case than aella's upbringing, which is yet again different from the religious fundamentalist Holocaust-denying household where the father has untreated bipolar disorder and paranoid delusions, and the mother makes medically inadequate home remedies to be the sole treatment for injuries of their half a dozen (or so) children who work on the family scrapyard where life-altering injuries were common. See this memoir -- sure, it's an extreme example, but how many times do we not hear about the experiences of children in situations like that, since the majority of them don't break free and become successful writers?

Comment by Lukas_Gloor on AI #106: Not so Fast · 2025-03-06T16:27:44.878Z · LW · GW

Andrej Karpathy recently made a video on which model to use under what circumstances. A lot of it probably won't be new to people who read these AI overviews here regularly, but I learned things from it and it's something I'm planning to send to people who are new to working with LLMs.

Comment by Lukas_Gloor on Ten people on the inside · 2025-03-06T14:11:22.306Z · LW · GW

I want to flesh out one particular rushed unreasonable developer scenario that I’ve been thinking about lately: there’s ten people inside the AI company who are really concerned about catastrophic risk from misalignment. The AI company as a whole pays lip service to AI risk broadly construed and talks occasionally about risk from AGI, but they don’t take misalignment risk in particular (perhaps especially risk from schemers) very seriously. 

[...]

What should these people try to do? The possibilities are basically the same as the possibilities for what a responsible developer might do:

  • Build concrete evidence of risk, to increase political will towards reducing misalignment risk
  • Implement safety measures that reduce the probability of the AI escaping or causing other big problems
  • Do alignment research with the goal of producing a model that you’re as comfortable as possible deferring to.


If the second and third bullet points turn out to be too ambitious for the dire situation that you envision, another thing to invest some effort in is fail-safe measures, where you don't expect that your intervention will make things go well, but you can still try to avert failure modes that seem exceptionally bad.

I'm not sure if this is realistic in practice, but it makes sense to stay on the lookout for opportunities.

Comment by Lukas_Gloor on What are the strongest arguments for very short timelines? · 2025-03-06T12:46:04.405Z · LW · GW

Great reply!

On episodic memory:
I've been watching Claude play Pokemon recently and I got the impression of, "Claude is overqualified but suffering from Memento-like memory limitations. Probably the agent scaffold also has some easy room for improvements (though it's better than post-it notes and tattooing sentences on your body)."

I don't know much about neuroscience or ML, but how hard can it be to make the AI remember what it did a few minutes ago? Sure, that's not all that's between Claude and TAI, but given that Claude is now within the human expert range on so many tasks, and given how fast progress has been recently, how can anyone not take short timelines seriously?

People who largely rule out 1-5y timelines seem to not have updated at all on how much they've presumably been surprised by recent AI progress. 

(If someone had predicted a decent likelihood for transfer learning and PhD-level research understanding shortly before those breakthroughs happened, followed by predicting a long gap after that, then I'd be more open to updating towards their intuitions. However, my guess is that people who have long TAI timelines now also held now-wrong confident long timelines for breakthroughs in transfer learning (etc.), and so, per my perspective, they arguably haven't made the update that whatever their brain is doing when they make timeline forecasts is not very good.)

Comment by Lukas_Gloor on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-14T00:59:26.617Z · LW · GW

I liked most thoughts in this post even though I have quite opposite intuitions about timelines.

I agree timeline intuitions feel largely vibes based, so I struggle to form plausible guesses about the exact source behind our different intuitions. 

I thought this passage was interesting in that respect:

Or, maybe there will be a sudden jump. Maybe learning sequential reasoning is a single trick, and now we can get from 4 to 1000 in two more years.

What you write as an afterthought is what I would have thought immediately. Sure, good chance I'm wrong. Still, to me, it feels like there's probably no big difference between "applying reasoning for 3 steps" and "applying reasoning for 1,000 steps." Admittedly (using your terminology), you probably need decent philosophical intelligence to organize your 1,000 steps with enough efficiency and sanity checks along the way so you don't run the risk of doing tons of useless work. But I don't see why that's a thing that cannot also be trained, now that we have the chain of thought stuff and can compare, over many answering runs, which thought decompositions more reliably generate accurate answers. Humans must have gotten this ability from somewhere and it's unlikely the brain has tons of specialized architecture for it. The thought generator seems more impressive/fancy/magic-like to me. Philosophical intelligence feels like something that heavily builds on a good thought generator plus assessor, rather than being entirely its own thing. (It also feels a bit subjective -- I wouldn't be surprised if there's no single best configuration of brain parameters for making associative leaps through thought space, such that different reasoners will be better at different problem domains. Obviously, smart reasoners should then pick up on where they themselves are experts and where they should better defer.)

Your blue and red graphs made me think about how AI appears to be in a thought generator overhang, so any thought assessor progress should translate into particularly steep capability improvements. If sequential reasoning is mostly a single trick, things should get pretty fast now. We'll see soon? :S

Comment by Lukas_Gloor on The Failed Strategy of Artificial Intelligence Doomers · 2025-02-01T00:45:06.004Z · LW · GW

[Edit: I wrote my whole reply thinking that you were talking about "organizational politics." Skimming the OP again, I realize you probably meant politics politics. :) Anyway, I guess I'm leaving this up because it also touches on the track record question.]

I thought Eliezer was quite prescient on some of this stuff. For instance, I remember this 2017 dialogue (so less than 2y after OpenAI was founded), which on the surface talks about drones, but if you read the whole post, it's clear that it's meant as an analogy to building AGI: 

AMBER:  The thing is, I am a little worried that the head of the project, Mr. Topaz, isn’t concerned enough about the possibility of somebody fooling the drones into giving out money when they shouldn’t. I mean, I’ve tried to raise that concern, but he says that of course we’re not going to program the drones to give out money to just anyone. Can you maybe give him a few tips? For when it comes time to start thinking about security, I mean.

CORAL:  Oh. Oh, my dear, sweet summer child, I’m sorry. There’s nothing I can do for you.

AMBER:  Huh? But you haven’t even looked at our beautiful business model!

CORAL:  I thought maybe your company merely had a hopeless case of underestimated difficulties and misplaced priorities. But now it sounds like your leader is not even using ordinary paranoia, and reacts with skepticism to it. Calling a case like that “hopeless” would be an understatement.

[...]

CORAL:  I suppose you could modify your message into something Mr. Topaz doesn’t find so unpleasant to hear. Something that sounds related to the topic of drone security, but which doesn’t cost him much, and of course does not actually cause his drones to end up secure because that would be all unpleasant and expensive. You could slip a little sideways in reality, and convince yourself that you’ve gotten Mr. Topaz to ally with you, because he sounds agreeable now. Your instinctive desire for the high-status monkey to be on your political side will feel like its problem has been solved. You can substitute the feeling of having solved that problem for the unpleasant sense of not having secured the actual drones; you can tell yourself that the bigger monkey will take care of everything now that he seems to be on your pleasantly-modified political side. And so you will be happy. Until the merchant drones hit the market, of course, but that unpleasant experience should be brief.

These passages read to me a bit as though Eliezer called in 2017 that EAs working at OpenAI as their ultimate path to impact (as opposed to for skill building or know-how acquisition) were wasting their time.

Maybe a critic would argue that this sequence of posts was more about Eliezer's views on alignment difficulty than on organizational politics. True, but it still reads as prescient and contains thoughts on org dynamics that apply even if alignment is just hard rather than super duper hard.

Comment by Lukas_Gloor on DeepSeek Panic at the App Store · 2025-01-29T01:07:33.843Z · LW · GW

If so, why were US electricity stocks down 20-28% (wouldn't we expect them to go up if the US wants to strengthen its domestic AI-related infrastructure) and why did TSMC lose less, percentage-wise, than many other AI-related stocks (wouldn't we expect it to get hit hardest)? 

Comment by Lukas_Gloor on nikola's Shortform · 2025-01-27T23:23:53.845Z · LW · GW

In order to submit a question to the benchmark, people had to run it against the listed LLMs; the question would only advance to the next stage once the LLMs used for this testing got it wrong. 
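
From what I understand, the filter works roughly like the following; this is a minimal sketch with made-up function names, since I don't know the benchmark's actual grading setup:

```python
def is_correct(model_answer: str, reference_answer: str) -> bool:
    # Crude string check just for this sketch; the real benchmark presumably grades more carefully.
    return model_answer.strip().lower() == reference_answer.strip().lower()

def advances_to_next_stage(question: str, reference_answer: str, test_models) -> bool:
    """A submitted question only advances if every listed test LLM gets it wrong.
    `test_models` is assumed to be a list of callables mapping a question to an answer string."""
    return all(not is_correct(model(question), reference_answer) for model in test_models)
```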

Comment by Lukas_Gloor on When is reward ever the optimization target? · 2025-01-12T14:59:21.972Z · LW · GW

So I think the more rational and cognitively capable a human is, the more likely they'll optimize more strictly and accurately for future reward.

If this is true at all, it's not going to be a very strong effect, meaning you can find very rational and cognitively capable people who do the opposite of this in decision situations that directly pit reward against the things they hold most dearly. (And it may not be true, because a lot of personal hedonists tend to "lack sophistication," in the sense that they don't understand that their own feeling of valuing nothing but their own pleasure isn't how everyone else who's smart experiences the world. So, there's at least a midwit level of "sophistication" where hedonists seem overrepresented.)

Maybe it's the case that there's a weak correlation that makes the quote above "technically accurate," but that's not enough to speak of reward being the optimization target. For comparison, even if it is the case that more intelligent people prefer classical music over k-pop, that doesn't mean classical music is somehow inherently superior to k-pop, or that classical music is "the music taste target" in any revealing or profound sense. After all, some highly smart people can still be into k-pop without making any mistake.

I've written about this extensively here and here. Some relevant excerpts from the first linked post:

One of many takeaways I got from reading Kaj Sotala’s multi-agent models of mind sequence (as well as comments by him) is that we can model people as pursuers of deep-seated needs. In particular, we have subsystems (or “subagents”) in our minds devoted to various needs-meeting strategies. The subsystems contribute behavioral strategies and responses to help maneuver us toward states where our brain predicts our needs will be satisfied. We can view many of our beliefs, emotional reactions, and even our self-concept/identity as part of this set of strategies. Like life plans, life goals are “merely” components of people’s needs-meeting machinery.[8]

Still, as far as components of needs-meeting machinery go, life goals are pretty unusual. Having life goals means to care about an objective enough to (do one’s best to) disentangle success on it from the reasons we adopted said objective in the first place. The objective takes on a life of its own, and the two aims (meeting one’s needs vs. progressing toward the objective) come apart. Having a life goal means having a particular kind of mental organization so that “we” – particularly the rational, planning parts of our brain – come to identify with the goal more so than with our human needs.[9]

To form a life goal, an objective needs to resonate with someone’s self-concept and activate (or get tied to) mental concepts like instrumental rationality and consequentialism. Some life goals may appeal to a person’s systematizing tendencies and intuitions for consistency. Scrupulosity or sacredness intuitions may also play a role, overriding the felt sense that other drives or desires (objectives other than the life goal) are of comparable importance.

[...]

Adopting an optimization mindset toward outcomes inevitably leads to a kind of instrumentalization of everything “near term.” For example, suppose your life goal is about maximizing the number of your happy days. The rational way to go about your life probably implies treating the next decades as “instrumental only.” On a first approximation, the only thing that matters is optimizing the chances of obtaining indefinite life extension (potentially leading to more happy days). Through adopting an outcome-focused optimizing mindset, seemingly self-oriented concerns such as wanting to maximize the number of happiness moments turn into an almost “other-regarding” endeavor. After all, only one’s far-away future selves get to enjoy the benefits – which can feel essentially like living for someone else.[12]

[12] This points at another line of argument (in addition to the ones I gave in my previous post) to show why hedonist axiology isn’t universally compelling: 
To be a good hedonist, someone has to disentangle the part of their brain that cares about short-term pleasure from the part of them that does long-term planning. In doing so, they prove they’re capable of caring about something other than their pleasure. It is now an open question whether they use this disentanglement capability for maximizing pleasure or for something else that motivates them to act on long-term plans.

Comment by Lukas_Gloor on Benito's Shortform Feed · 2025-01-02T12:13:44.686Z · LW · GW

I like all the considerations you point out, but based on that reasoning alone, you could also argue that a con man who ran a lying scheme for 1 year and stole only like $20,000 should get life in prison -- after all, con men are pathological liars and that phenotype rarely changes all the way. And that seems too harsh?

I'm in two minds about it: On the one hand, I totally see the utilitarian argument of just locking up people who "lack a conscience" forever the first time they get caught for any serious crime. On the other hand, they didn't choose how they were born, and some people without prosocial system-1 emotions do in fact learn how to become a decent citizen. 

It seems worth mentioning that punishments for financial crime often include measures like "person gets banned from their industry" or them getting banned from participating in all kinds of financial schemes. In reality, the rules there are probably too lax, and people who got banned in finance or pharma just transition to running crypto scams or selling predatory online courses on how to be successful (lol). But in theory, I like the idea of adding things to the sentencing that make re-offending less likely. This way, you can maybe justify giving people second chances.

Comment by Lukas_Gloor on What are the strongest arguments for very short timelines? · 2024-12-23T16:00:31.450Z · LW · GW

Suppose that a researcher's conception of current missing pieces is a mental object M, their timeline estimate is a probability function P, and their forecasting expertise F is a function that maps M to P. In this model, F can be pretty crazy, creating vast differences in P depending how you ask, while M is still solid.

Good point. This would be reasonable if you think someone can be super bad at F and still great at M.

Still, I think estimating "how big is this gap?" and "how long will it take to cross it?" might be quite related, so I expect the skills to be correlated or even strongly correlated.

Comment by Lukas_Gloor on What are the strongest arguments for very short timelines? · 2024-12-23T11:50:42.055Z · LW · GW

It surveyed 2,778 AI researchers who had published peer-reviewed research in the prior year in six top AI venues (NeurIPS, ICML, ICLR, AAAI, IJCAI, JMLR); the median time for a 50% chance of AGI was either in 23 or 92 years, depending on how the question was phrased.

Doesn't that discrepancy (how much answers vary between different ways of asking the question) tell you that the median AI researcher who published at these conferences hasn't thought about this question sufficiently and/or sanely?

It seems irresponsible to me to update even just a small bit on the views of the specific reference class of which your above statement is true.

If you take people who follow progress closely and have thought more and longer about AGI as a research target specifically, my sense is that the ones who have longer timeline medians tend to say more like 10-20y rather than 23y+. (At the same time, there's probably a bubble effect in who I follow or talk to, so I can get behind maybe lengthening that range a bit.)

Doing my own reasoning, here are the considerations that I weigh heavily:  

  • we're within the human range of most skill types already (which is where many of us in the past would have predicted that progress speeds up, and I don't see any evidence of anything that should change our minds on that past prediction – deep learning visibly hitting a wall would have been one conceivable way, but it hasn't happened yet)
  • that time for "how long does it take to cross and overshoot the human range at a given skill?" has historically gotten a lot smaller and is maybe even decreasing(?) (e.g., it admittedly took a long time to cross the human expert range in chess, but it took less long in Go, less long at various academic tests or essays, etc., to the point that chess certainly doesn't constitute a typical baseline anymore)
  • that progress has been quite fast lately, so that it's not intuitive to me that there's a lot of room left to go (sure, agency and reliability and "get even better at reasoning")
  • that we're pushing through compute milestones rather quickly because scaling is still strong with some more room to go, so on priors, the chance that we cross AGI compute thresholds during this scale-up is higher than that we'd cross it once compute increases slow down
  • that o3 seems to me like significant progress in reliability, one of the things people thought would be hard to make progress on

Given all that, it seems obvious that we should have quite a lot of probability of getting to AGI in a short time (e.g., 3 years). Placing the 50% forecast feels less obvious because I have some sympathy for the view that says these things are notoriously hard to forecast and we should smear out uncertainty more than we'd intuitively think (that said, lately the trend has been that people consistently underpredict progress, and maybe we should just hard-update on that.) Still, even on that "it's prudent to smear out the uncertainty" view, let's say that implies that the median would be like 10-20 years away. Even then, if we spread out the earlier half of probability mass uniformly over those 10-20 years, with an added probability bump in the near-term because of the compute scaling arguments (we're increasing training and runtime compute now but this will have to slow down eventually if AGI isn't reached in the next 3-6 years or whatever), that IMO very much implies at least 10% for the next 3 years. Which feels practically enormously significant. (And I don't agree with smearing things out too much anyway, so my own probability is closer to 50%.)
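
Rough arithmetic behind that "at least 10%" claim, using the midpoint of the 10-20 year range as an illustrative median:

```python
# Illustrative numbers: take a 15-year median (midpoint of 10-20y) and spread the first
# half of the probability mass uniformly over those years.
median_years = 15
mass_before_median = 0.5                      # by definition of the median
per_year = mass_before_median / median_years  # ~3.3% per year
p_next_3_years = 3 * per_year                 # ~10%, before any near-term compute-scaling bump
print(round(p_next_3_years, 2))               # 0.1
```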

Comment by Lukas_Gloor on o3 · 2024-12-21T23:11:00.488Z · LW · GW

Well, the update for me would go both ways. 

On one side, as you point out, it would mean that the model's single pass reasoning did not improve much (or at all). 

On the other side, it would also mean that you can get large performance and reliability gains (on specific benchmarks) by just adding simple stuff. This is significant because you can do this much more quickly than the time it takes to train a new base model, and there's probably more to be gained in that direction – similar tricks we can add by hardcoding various "system-2 loops" into the AI's chain of thought and thinking process. 

You might reply that this only works if the benchmark in question has easily verifiable answers. But I don't think it is limited to those situations. If the model itself (or some subroutine in it) has some truth-tracking intuition about which of its answer attempts are better/worse, then running it through multiple passes and trying to pick the best ones should get you better performance even without easy and complete verifiability (since you can also train on the model's guesses about its own answer attempts, improving its intuition there).

Besides, I feel like humans do something similar when we reason: we think up various ideas and answer attempts and run them by an inner critic, asking "is this answer I just gave actually correct/plausible?" or "is this the best I can do, or am I missing something?"
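
Here's a minimal sketch of the kind of loop I have in mind; `generate` and `score` are placeholders for whatever sampling and self-rating machinery a lab actually uses, not any real API:

```python
def best_of_n(prompt: str, generate, score, n: int = 8) -> str:
    """Sample several answer attempts and keep the one the model's own 'inner critic' rates highest.
    `generate(prompt)` returns a candidate answer; `score(prompt, answer)` returns the model's
    (imperfect but hopefully truth-tracking) estimate of how good that answer is."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda answer: score(prompt, answer))
```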

(I'm not super confident in all the above, though.)

Lastly, I think the cost bit will go down by orders of magnitude eventually (I'm confident of that). I would have to look up trends to say how quickly I expect $4,000 in runtime costs to go down to $40, but I don't think it will take all that long. Also, if you can do extremely impactful things with some model, like automating further AI progress on training runs that cost billions, then willingness to pay for model outputs could be high anyway.

Comment by Lukas_Gloor on Fertility Roundup #4 · 2024-12-02T15:44:23.781Z · LW · GW

When the issue is climate change, a prevalent rationalist take goes something like this:

"Climate change would be a top priority if it weren't for technological progress. However, because technological advances will likely help us to either mitigate the harms from climate change or will create much bigger problems on their own, we probably shouldn't prioritize climate change too much." 

We could say the same thing about these trends of demographic aging that you highlight. So, I'm curious why you're drawn to this topic and where the normative motivation in your writing is coming from.

In the post, you use normative language like, "This suggests that we need to lower costs along many fronts of both money and time, and also we need to stop telling people to wait until they meet very high bars." (In the context of addressing people's cited reasons for why they haven't had kids – money, insecurity about money, not being able to afford kids or the house to raise them in, and mental health.)

The way I conceptualize it, one can zoom in on different, plausibly-normatively-central elements of the situation:

(1) The perspective of existing people.

1a Nation-scale economic issues from an aging demographic, such as collapse of pension schemes, economic stagnation from the aging workforce, etc. 

1b Individual happiness and life satisfaction (e.g., a claim that having children tends to make people happier, also applying to parents 'on the margin,' people who, if we hadn't encouraged them, would have decided against children).

(2) Some axiological perspective that considers the interests of both existing and newly created people/beings.

It seems uncontroversial that both 1a and 1b are important perspectives, but it's not obvious to me whether 1a is a practical priority for us in light of technological progress (cf the parallel to climate change) or how the empirics of 1b shake out (whether parents 'on the margin' are indeed happier). (I'm not saying 1b is necessarily controversial – for all I know, maybe the science already exists and is pretty clear. I'm just saying: I'm not personally informed on the topic even though I have read your series of posts on fertility.)

And then, (2) seems altogether subjective and controversial in the sense that smart people hold different views on whether it's all-things-considered good to encourage people to have lower standards for bringing new people into existence. Also, there are strong reasons (I've written up a thorough case for this here and here) why we shouldn't expect there to be an objective answer on "how to do axiology."

This series would IMO benefit from a "Why I care about this?" note, because without it, I get the feeling of "Zvi is criticizing things governments do/don't do in a way that might underhandedly bias readers into thinking that the implied normative views on population ethics are unquestionably correct." The way I see it, governments are probably indeed behaving irrationally here, given that they're not bought into the prevalent rationalist worldview on imminent technological progress (and that's an okay thing to sneer at), but this doesn't mean that we have to go "boo!" to all things associated with not choosing children, and "yeah!" to all things associated with choosing them.

That said, I still found the specific information in these roundups interesting, since this is clearly a large societal trend and it's interesting to think through causes, implications, etc. 

Comment by Lukas_Gloor on AI #92: Behind the Curve · 2024-11-28T15:55:48.851Z · LW · GW

The tabletop game sounds really cool!

Interesting takeaways.

The first was exactly the above point, and that at some point, ‘I or we decide to trust the AIs and accept that if they are misaligned everyone is utterly f***ed’ is an even stronger attractor than I realized.

Yeah, when you say it like that... I feel like this is gonna be super hard to avoid!

The second was that depending on what assumptions you make about how many worlds are wins if you don’t actively lose, ‘avoid turning wins into losses’ has to be a priority alongside ‘turn your losses into not losses, either by turning them around and winning (ideal!) or realizing you can’t win and halting the game.’

There's also the option of, once you realize that winning is no longer achievable, trying to lose less badly than you could have otherwise. For instance, if out of all the trajectories where humans lose, you can guess that some of them seem more likely to bring about some extra bad dystopian scenario, you can try to prevent at least those. Some examples that I'm thinking of are AIs being spiteful or otherwise anti-social (on top of not caring about humans) or AIs being conflict-prone in AI-vs-AI interactions (including perhaps AIs aligned to alien civilizations). Of course, it may not be possible to form strong opinions over what makes for a better or worse "losing" scenario – if you remain very uncertain, all losing will seem roughly equally not valuable.

The third is that certain assumptions about how the technology progresses had a big impact on how things play out, especially the point at which some abilities (such as superhuman persuasiveness) emerge.

Yeah, but I like the idea of rolling dice for various options that we deem plausible (and having this built into the game). 

I'm curious to read takeaways from more groups if people continue to try this. I'm also curious about players' thoughts on good group sizes (how many people played at once, and whether you would have preferred more or fewer players).

Comment by Lukas_Gloor on OpenAI Email Archives (from Musk v. Altman and OpenAI blog) · 2024-11-17T00:27:28.747Z · LW · GW

I agree that it sounds somewhat premature to write off Larry Page based on attitudes he had a long time ago, when AGI seemed more abstract and far away, and then not to try communicating with him again later on. If that were Musk's true and only reason for founding OpenAI, then this was indeed a communication fuckup.

However, my best guess is that this story about Page was interchangeable with any number of other plausible criticisms of his AGI-building competition that Musk would likely have come up with in nearby worlds. People like Musk (and Altman too) tend to have a desire to do the most important thing and the belief that they can do this thing a lot better than anyone else. On that assumption, it's not too surprising that Musk found a reason for having to step in and build AGI himself. In fact, on this view, we should expect to see surprisingly little sincere exploration of "joining someone else's project to improve it" solutions.

I don't think this is necessarily a bad attitude. Sometimes people who think this way are right in the specific situation. It just means that we see the following patterns a lot:

  • Ambitious people start their own thing rather than join some existing thing.
  • Ambitious people have fallouts with each other after starting a project together where the question of "who eventually gets de facto ultimate control" wasn't totally specified from the start. 

(Edited away a last paragraph that used to be here 50 minutes after posting. I wanted to express something like "Sometimes communication only prolongs the inevitable," but that maybe sounds a bit too negative: even if you're going to fall out eventually, good communication can probably make the falling-out less bad.)

Comment by Lukas_Gloor on OpenAI Email Archives (from Musk v. Altman and OpenAI blog) · 2024-11-16T17:45:40.299Z · LW · GW

I thought the part you quoted was quite concerning, also in the context of what comes afterwards: 

Hiatus: Sam told Greg and Ilya he needs to step away for 10 days to think. Needs to figure out how much he can trust them and how much he wants to work with them. Said he will come back after that and figure out how much time he wants to spend.

Sure, the email by Sutskever and Brockman gave off some nonviolent-communication vibes, and maybe it isn't "the professional thing" to air one's feelings and perceived mistakes like that, but they seemed genuine in what they wrote, and they raised incredibly important concerns that are inherently difficult to bring up. Also, especially with hindsight, it seems like they had valid reasons to be concerned about Altman's power-seeking tendencies!

When someone expresses legitimate-given-the-situation concerns about your alignment and your reaction is to basically gaslight them into thinking they did something wrong for finding it hard to trust you, and then you make it seem like you are the poor victim who needs 10 days off work to figure out whether you can still trust them, that feels messed up! (It's also a bit hypocritical, because the whole "I need 10 days to figure out if I can still trust you for thinking I like being CEO a bit too much" move seems childish too.) 

(Of course, these emails are just snapshots and we might be missing things that happened in between via other channels of communication, including in-person talks.)

Also, I find it interesting that they (Sutskever and Brockman) criticized Musk just as much as Altman (if I understood their email correctly), which should have made it easier for Altman to react with grace. I guess given Musk's own annoyed reaction, maybe Altman called the others' email childish to side with Musk's dismissive reaction to that same email.

Lastly, this email thread made me wonder what happened between Brockman and Sutskever in the meantime, since Brockman now seems to no longer hold the same concerns about Altman, even though recent events seem to have added a lot of new fuel to them.

Comment by Lukas_Gloor on Poker is a bad game for teaching epistemics. Figgie is a better one. · 2024-07-13T16:05:01.155Z · LW · GW

Some of the points you make don't apply to online poker. But I imagine that the most interesting rationality lessons from poker come from studying other players and exploiting them, rather than memorizing and developing an intuition for the pure game theory of the game. 

  • If you did want to focus on the latter goal, you can play online poker (many players play more than 12 tables at once) and, after every session, run your hand histories through a program (e.g., "GTO Wizard") that will tell you where you made mistakes compared to optimal strategy, and how much they would cost you against an optimally playing opponent. Then, for any mistake, you can even input the specific spot into the trainer program and practice it with similar hands, four-tabling against the computer, with immediate feedback every time on how you played the spot. 

Comment by Lukas_Gloor on An AI Race With China Can Be Better Than Not Racing · 2024-07-03T01:09:22.247Z · LW · GW

It seems important to establish whether we are in fact going to be in a race and whether one side isn't already far ahead.

With racing, there's a difference between optimizing the chance of winning vs optimizing the extent to which you beat the other party when you do win. If it's true that China is currently pretty far behind, and if TAI timelines are fairly short so that a lead now is pretty significant, then the best version of "racing" shouldn't be "get to the finish line as fast as possible." Instead, it should be "use your lead to your advantage." So, the lead time should be used to reduce risks.

Not sure this is relevant to your post in particular; I could've made this point also in other discussions about racing. Of course, if a lead is small or non-existent, the considerations will be different.

Comment by Lukas_Gloor on Population ethics and the value of variety · 2024-06-27T00:47:43.907Z · LW · GW

I wrote a long post last year saying basically that.

Comment by Lukas_Gloor on Suffering Is Not Pain · 2024-06-19T11:50:51.029Z · LW · GW

Even if attaining a total and forevermore cessation of suffering is substantially more difficult/attainable by substantially fewer people in one lifetime, I don't think it's unreasonable to think that most people could suffer at least 50 percent less with dedicated mindfulness practice. I'm curious as to what might feed an opposing intuition for you! I'd be quite excited about empirical research that investigates the tractability and scalability of meditation for reducing suffering, in either case.

My sense is that existing mindfulness studies don't show the sort of impressive results that we'd expect if this were a great solution.

Also, I think the people who would benefit most from having less day-to-day suffering often have no "free room" available for meditation practice, and that seems like an issue that's hard to overcome even if meditation practice would indeed help them a lot.

It's already a sign of having a decently good life when you're able to start dedicating time to something like meditation, which I think requires a bit more mental energy than just watching series or scrolling through the internet. A lot of people have leisure time, but it's a privilege to be mentally well off enough to do purposeful activities during your leisure time. The people who have a lot of this purposeful time probably (usually) aren't among the ones who suffer most (whereas the people who don't have it will struggle to stick to regular meditation practice, for good reasons).

For instance, if someone has a chronic illness with frequent pain and nearly constant fatigue, I can see how it might be good for them to practice meditation for pain management, but higher up on their priority list are probably things like "how do I manage to do daily chores despite low energy levels?" or "how do I not get let go at work?"

Similarly, for other things people may struggle with (addictions, financial worries, anxieties of various sorts, other mental health issues), meditation is often something that would probably help, but it doesn't feel like priority number one for people with problem-ridden, difficult lives. It's pretty hard to keep up motivation for training something that you're not fully convinced is your top priority, especially if you're struggling with other things.

I see meditation as similar to things like "eat healthier, exercise more, go to sleep on time, and don't consume distracting content or too much light in the late evenings, etc." These things have great benefits, but they're also hard, so there's no low-hanging fruit, and interventions in this space will have limited effectiveness (or at least limited cost-effectiveness; you could probably get quite far if you gifted people their own private nutritionist and cook, fitness trainer and motivator, house cleaner and personal assistant, and meditation coach, and gave them enough money for financial independence, etc.).

And then the people who do have enough "free room" to meditate may be well off enough not to feel like they need it? In some ways, the suffering of a person who is kind of well off in life isn't that bad, and instead of devoting an hour per day to meditation practice to reduce the little suffering they have, maybe the well-off person would rather take Spanish lessons, or train for a marathon, etc.

(By the way, would it be alright if I ping you privately to set up a meeting? I've been a fan of your writing since becoming familiar with you during my time at CLR and would love a chance to pick your brain about SFE stuff and hear about what you've been up to lately!)

I'll send you a DM!

Comment by Lukas_Gloor on Suffering Is Not Pain · 2024-06-18T19:11:41.777Z · LW · GW

[...] I am certainly interested to know if anyone is aware of sources that make a careful distinction between suffering and pain in arguing that suffering and its reduction is what we (should) care about.

I did so in my article on Tranquilism, so I broadly share your perspective!

I wouldn't go as far as what you're saying in endnote 9, though. I mean, I see some chance that you're right in the impractical sense of, "If someone gave up literally all they cared about in order to pursue ideal meditation training under ideal circumstances (and during the training they don't develop any physical illness or other issues that prevent successful completion of the training), then they could learn to control their mental states and avoid nearly all future sources of suffering." But that's pretty impractical even if true!

It's interesting, though, what you say about CBT. I agree it makes sense to be accurate about these distinctions, and that it could affect specific interventions (though maybe not at the largest scale of prioritization, the way I see the landscape).

Comment by Lukas_Gloor on Matthew Barnett's Shortform · 2024-06-17T10:21:01.909Z · LW · GW

This would be a valid rebuttal if instruction-tuned LLMs were only pretending to be benevolent as part of a long-term strategy to eventually take over the world, and execute a treacherous turn. Do you think present-day LLMs are doing that? (I don't)

Or that they have a sycophancy drive. Or that, next to "wanting to be helpful," they also have a bunch of other drives that will likely win out over the "wanting to be helpful" part once the system becomes better at long-term planning and at orienting its shards towards consequentialist goals. 

On that latter model, "wanting to be helpful" is a mask that the system is trained to play better and better, but it isn't the only thing the system wants to do, and it might find, once it gets good at trying on various other masks to see how this improves its long-term planning, that it for some reason prefers a different "mask" to become its locked-in personality. 

Comment by Lukas_Gloor on MIRI 2024 Communications Strategy · 2024-06-01T14:28:15.966Z · LW · GW

I thought the first paragraph and the bolded bit of your comment seemed insightful. I don't see why what you're saying is wrong – it seems right to me (but I'm not sure).

Comment by Lukas_Gloor on MIRI 2024 Communications Strategy · 2024-06-01T14:20:00.790Z · LW · GW

I am not convinced MIRI has given enough evidence to support the idea that unregulated AI will kill everyone and their children.

The way you're expressing this feels like an unnecessarily strong bar. 

I think advocacy for an AI pause already seems pretty sensible to me if we accept the following premises: 

  • The current AI research paradigm mostly makes progress in capabilities before progress in understanding. 
    (This puts AI progress in a different reference class from most other technological progress, so any arguments with base rates from "technological progress normally doesn't kill everyone" seem misguided.)
  • AI could very well kill most of humanity, in the sense that it seems defensible to put this at anywhere from 20-80% (we can disagree on the specifics of that range, but that's where I'd put it looking at the landscape of experts who seem to be informed and doing careful reasoning (so not LeCun)).  
  • If we can't find a way to ensure that TAI is developed by researchers and leaders who act with a degree of responsibility proportional to the risks/stakes, it seems better to pause.
     

Edited to add the following: 
There's also a sense in which whether to pause is quite independent of the default risk level. Even if the default risk were only 5%, if there were a solid and robust argument that pausing for five years would reduce it to 4%, that's clearly very good! (It would be unfortunate for the people who will die preventable deaths in the next five years, but it still helps more people overall to pause under these assumptions.) 
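
To make the "helps more people overall" claim concrete, here's a minimal back-of-the-envelope sketch. All of the numbers are my own illustrative assumptions (world population, the five-year death toll, and the fraction f of those deaths that earlier TAI would actually have averted), and the tally counts existing people only; counting future generations would only strengthen the case for pausing.

```latex
% Illustrative assumptions, not claims from the post:
% N = existing people, D = worldwide deaths over the five-year pause,
% f = fraction of those deaths that earlier TAI would have averted,
% \Delta p = reduction in extinction-level risk from pausing.
N \approx 8 \times 10^{9}, \qquad D \approx 3 \times 10^{8}, \qquad \Delta p = 0.05 - 0.04 = 0.01
\text{Expected lives saved by pausing} \approx \Delta p \cdot N = 8 \times 10^{7}
\text{Expected lives lost to the delay} \approx f \cdot D = f \cdot 3 \times 10^{8}
\text{Pausing comes out ahead iff } f \cdot 3 \times 10^{8} < 8 \times 10^{7} \;\Longleftrightarrow\; f \lesssim 0.27
```

So under these assumed numbers, pausing wins unless more than roughly a quarter of all deaths in the intervening five years would actually have been prevented by not pausing, which seems like a high bar.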

Comment by Lukas_Gloor on MIRI 2024 Communications Strategy · 2024-06-01T13:54:05.057Z · LW · GW

Would most existing people accept a gamble with 20% of chance of death in the next 5 years and 80% of life extension and radically better technology? I concede that many would, but I think it's far from universal, and I wouldn't be too surprised if half of people or more think this isn't for them.

I personally wouldn't want to take that gamble (strangely enough I've been quite happy lately and my life has been feeling meaningful, so the idea of dying in the next 5 years sucks).

(Also, I want to flag that I strongly disagree with your optimism.)
 

Comment by Lukas_Gloor on OpenAI: Helen Toner Speaks · 2024-05-31T00:34:44.316Z · LW · GW

we have found Mr Altman highly forthcoming

That's exactly the line that made my heart sink.

I find it a weird thing to choose to say/emphasize.

The issue under discussion isn't whether Altman hid things from the new board; it's whether he hid things from the old board a long while ago.

Of course he's going to seem forthcoming towards the new board at first. So, the new board having the impression that he was forthcoming towards them? This isn't information that helps us much in assessing whether to side with Altman vs the old board. That makes me think: why report on it? It would be a more relevant update if Taylor or Summers were willing to stick their necks out a little further and say something stronger and more direct, something more in the direction of (hypothetically), "In all our by-now extensive interactions with Altman, we got the sense that he's the sort of person you can trust; in fact, he had surprisingly circumspect and credible things to say about what happened, and he seems self-aware about things that he could've done better (and those things seem comparatively small or at least very understandable)." If they had added something like that, it would have been more interesting and surprising. (At least for those who are currently skeptical or outright negative towards Altman; but also "surprising" in terms of "nice, the new board is really invested in forming their own views here!"). 

By contrast, this combination of basically defending Altman (and implying pretty negative things about Toner and McCauley's objectivity and their judgment on what they deem fair to tell the media), but doing so without sticking their necks out, makes me worried that the board is less invested in outcomes and more invested in playing their role. By "not sticking their necks out," I mean outsourcing judgment-forming to the independent investigation and mentioning clearly unsurprising and not-very-relevant things like whether Altman has been forthcoming to them so far. By "less invested in outcomes and more invested in playing their role," I mean the possibility that the new board maybe doesn't consider it important to form opinions at the object level (on Altman's character and his suitability for OpenAI's mission) or to have a burning desire to make the best CEO-related decisions. Instead, the alternative mode they could be in would be having in mind a specific "role" that board members play, which includes things like, e.g., "check whether Altman ever gets caught doing something outrageous," "check if he passes independent legal reviews," or "check if Altman's answers seem reassuring when we occasionally ask him critical questions." And then, that's it, job done. If that's the case, I think that'd be super unfortunate. The more important the org, the more it matters to have an engaged/invested board that considers itself ultimately responsible for CEO-related outcomes ("will history look back favorably on their choices regarding the CEO?").

To sum up, I'd have much preferred it if their comments had either included them sticking their necks out a little more, or if I had gotten from them more of a sense of still withholding judgment. I think the latter would have been possible even in combination with still reminding the public that Altman, e.g., passed that independent investigation, or that some of the old board members' claims against him seem thinly supported, etc. (If that's their impression, fair enough.) For instance, it's perfectly possible to say something like, "In our duty as board members, we haven't noticed anything unusual or worrisome, but we'll continue to keep our eyes open." That's admittedly pretty similar, in substance, to what they actually said. Still, it would read as a lot more reassuring to me because of its different emphasis. My alternative phrasing would help convey that (1) they don't naively believe that Altman – in worlds where he is dodgy – would likely have already given things away in his interactions with them, and (2) that they consider themselves responsible for the outcome (and not just for following the common procedures) of whether OpenAI will be led well and in line with its mission.
(Maybe they do in fact have these views, 1 and 2, but didn't do a good job here at reassuring me of that.)

Comment by Lukas_Gloor on OpenAI: Fallout · 2024-05-30T00:53:22.912Z · LW · GW

Followed immediately by: 

I too also have very strong concerns that we are putting a person whose highest stats are political maneuvering and deception, who is very high in power seeking, into this position. By all reports, you cannot trust what this man tells you.

Comment by Lukas_Gloor on Stephen Fowler's Shortform · 2024-05-18T16:01:29.364Z · LW · GW

For me, the key question in situations when leaders made a decision with really bad consequences is, "How did they engage with criticism and opposing views?"

If they did well on this front, then I don't think it's at all mandatory to push for leadership changes (though certainly, the worse someone's track record gets, the more that speaks against them).

By contrast, if leaders tried to make the opposition look stupid or if they otherwise used their influence to dampen the reach of opposing views, then being wrong later is unacceptable.

Basically, I want to allow for a situation where someone was like, "this is a tough call and I can see reasons why others wouldn't agree with me, but I think we should do this," and then ends up being wrong, but I don't want to allow situations where someone is wrong after having expressed something more like, "listen to me, I know better than you, go away."

In the first situation, it might still be warranted to push for leadership changes (esp. if there's actually a better alternative), but I don't see it as mandatory.

The author of the original short form says we need to hold leaders accountable for bad decisions because otherwise the incentives are wrong. I agree with that, but I think it's too crude to tie incentives to whether a decision looks right or wrong in hindsight. We can do better and evaluate how someone went about making a decision and how they handled opposing views. (Basically, if opposing views aren't loud enough that you'd have to actively squish them by using your influence illegitimately, then the mistake isn't just yours as the leader; it's also that the situation wasn't sufficiently obvious to others around you.) I expect that everyone who has strong opinions and is ambitious and agenty in a leadership position is going to make some costly mistakes. The incentives shouldn't be such that leaders shy away from consequential interventions.

Comment by Lukas_Gloor on Ilya Sutskever and Jan Leike resign from OpenAI [updated] · 2024-05-15T15:17:43.908Z · LW · GW

I agree with what you say in the first paragraph. If you're talking about Ilya, which I think you are, I can see what you mean in the second paragraph, but I'd flag that even if he had some sort of plan here, it seems pretty costly and also just bad norms for someone with his credibility to say something indicating that he thinks OpenAI is on track to handle its great responsibility well, assuming he doesn't actually believe this. It's one thing not to say negative things explicitly; it's a different thing to say something positive that rules out the negative interpretations. I tend to take people at their word if they say things explicitly, even if I can assume that they were facing various pressures. If I were to assume that Ilya is saying positive things he doesn't actually believe, that wouldn't reflect well on him, IMO. 

If we consider Jan Leike's situation, I think what you're saying applies more easily, because him leaving without comment already reflects poorly on OpenAI's standing on safety, and maybe he just decided that saying something explicitly doesn't really add a ton of information (esp. since maybe there are other people who might be in a better position to say things in the future). Also, I'm not sure it affects future employment prospects too much if someone leaves a company, signs a non-disparagement agreement, and goes "no comment" to indicate that there was probably dissatisfaction with some aspects of the company. There are many explanations for this and if I was making hiring decisions at some AI company, even if it's focused on profits quite a bit, I wouldn't necessarily interpret this as a negative signal. 

That said, signing non-disparagement agreements certainly feels like it has costs and constrains option value, so it seems like a tough choice.

Comment by Lukas_Gloor on Ilya Sutskever and Jan Leike resign from OpenAI [updated] · 2024-05-15T13:00:28.010Z · LW · GW

It seems likely (though not certain) that they signed non-disparagement agreements, so we may not see more damning statements from them even if that's how they feel. Also, Ilya at least said some positive things in his leaving announcement, which indicates either that he caved in to pressure (or to overly high agreeableness towards former co-workers) or that he's genuinely not particularly worried about the direction of the company and left more for reasons related to his new project.