Posts

Dependencies and conditional probabilities in weather forecasts 2022-03-07T21:23:12.696Z
Money creation and debt 2020-08-12T20:30:42.321Z
Superintelligence and physical law 2016-08-04T18:49:19.145Z
Scope sensitivity? 2015-07-16T14:03:31.933Z
Types of recursion 2013-09-04T17:48:55.709Z
David Brooks from the NY Times writes on earning-to-give 2013-06-04T15:15:26.992Z
Cryonics priors 2013-01-20T22:08:58.582Z

Comments

Comment by AnthonyC on Dear AGI, · 2025-02-21T18:45:43.679Z · LW · GW

Oh, I already completely agree with that. But quite frankly I don't have the skills to contribute to AI development meaningfully in a technical sense, or the right kind of security mindset to think anyone should trust me to work on safety research. And of course, all the actual plans I've seen anyone talk about are full of holes, and many seem to rely on something akin to safety-by-default for at least part of the work, whether they admit it or not. Which I hope ends up not being true, but if someone decides to roll the dice on the future that way, then it's best to try to load the dice at least a little with higher-quality writing on what humans think and want for themselves and the future.

And yeah, I agree you should be worried about this getting so many upvotes, including mine. I sure am. I place this kind of writing under why-the-heck-not-might-as-well. There aren't anywhere near enough people or enough total competence trying to really do anything to make this go well, but there are enough that new people trying more low-risk things is likely to be either irrelevant or net-positive. Plus I can't really imagine ever encountering a plan, even a really good one, where this isn't a valid rejoinder:

Are you confident in the success of this plan? No, that is the wrong question, we are not limited to a single plan. Are you certain that this plan will be enough, that we need essay no others? Asked in such fashion, the question answers itself. The path leading to disaster must be averted along every possible point of intervention.

Comment by AnthonyC on How to Make Superbabies · 2025-02-21T00:42:25.593Z · LW · GW

And that makes perfect sense. I guess I'm just not sure I trust any particular service provider or research team to properly list the full set of things it's important to weight against. Kind of feels like a lighter version of not trusting a list of explicit rules someone claims will make an AI safe.

Comment by AnthonyC on How to Make Superbabies · 2025-02-20T19:25:20.758Z · LW · GW

True, and this does indicate that children produced from genes found in 2 parents will not be outside the range which a hypothetical natural child of theirs could occupy. I am also hopeful that this is what matters, here. 

However, there are absolutely, definitely viable combinations of genes found in a random pair of parents which, if combined in a single individual, result in high-IQ offspring predisposed to any number of physical or mental problems, some of which may not manifest until long after the child is born. In practice, any intervention of the type proposed here seems likely to create many children with specific combinations of genes which we know are individually helpful for specific metrics, but which may not often (or ever) have all co-occurred. This is true even in the cautious, conservative early generations where we stay within the scope of natural human variations. Thereafter, how do we ensure we're not trialing something on an entire generation at once? I don't want us to end up in a situation where a single mistake ends up causing population-wide problems because we applied it to hundreds of millions of people before the problem manifested.

Comment by AnthonyC on How to Make Superbabies · 2025-02-20T12:58:39.824Z · LW · GW

I definitely want to see more work in this direction, and agree that improving humans is a high-value goal.

But to play devil's advocate for a second on what I see as my big ethical concern: There's a step in the non-human selective breeding or genetic modification comparison where the experimenter watches several generations grow to maturity, evaluates whether their interventions worked in practice, and decides which experimental subjects if any get to survive or reproduce further. What's the plan for this step in humans, since "make the right prediction every time at the embryo stage" isn't a real option?

Concrete version of that question:  Suppose we implement this as a scalable commercial product and find out that e.g. it causes a horrible new disease, or induces sociopathic or psychopathic criminal tendencies, that manifest at age 30, after millions of parents have used it. What happens next?

Comment by AnthonyC on How might we safely pass the buck to AI? · 2025-02-20T12:40:28.276Z · LW · GW

I expect that we will probably end up doing something like this, whether it is workable in practice or not, if for no other reason than that it seems to be the most common plan anyone in a position to actually implement any plan at all has devised and publicized. I appreciate seeing it laid out in so much detail.

By analogy, it certainly rhymes with the way I use LLMs to answer fuzzy complex questions now. I have a conversation with o3-mini to get all the key background I can into the context window, have it write a prompt to pass the conversation onto o1-pro, repeat until I have o1-pro write a prompt for Deep Research, and then answer Deep Research's clarifying questions before giving it the go ahead. It definitely works better for me than trying to write the Deep Research prompt directly. But, part of the reason it works better is that at each step, the next-higher-capabilities model comes back to ask clarifying questions I hadn't noticed were unspecified variables, and which the previous model also hadn't noticed were unspecified variables. In fact, if I take the same prompt and give it to Deep Research multiple times in different chats, it will come back with somewhat different sets of clarifying questions - it isn't actually set up to track down all the unknown variables it can identify. This reinforces that even for fairly straightforward fuzzy complex questions, there are a lot of unstated assumptions.

If Deep Research can't look at the full previous context and correctly guess what I intended, then it is not plausible that o1-pro or o3-mini could have done so. I have in fact tested this, and the previous models either respond that they don't know the answer, or give an answer that's better than chance but not consistently correct. Now, I get that you're talking about future models and systems with higher capability levels generally, but adding more steps to the chain doesn't actually fix this problem. If any given link can't anticipate the questions and correctly intuit the answer about what the value of the unspecified variables should be - what the answers to the clarifying questions should be - then the plan fails, because the previous model will be worse at this. If it can, then it does not need to ask the previous model in the chain. The final model will either get it right on its own, or else end up with incorrect answers to some of the questions about what it's trying to achieve. It may ask anyway, if the previous models are more compute efficient and still add information. But it doesn't  strictly need them. 

And unfortunately, keeping the human in the loop also doesn't solve this. We very often don't know what we actually want well enough to correctly answer every clarifying question a high-capabilities model could pose. And if we have a set of intervening models approximating and abstracting the real-but-too-hard question into something a human can think about, well, that's a lot of translation steps where some information is lost. I've played that game of telephone among humans often enough to know it only rarely works ("You're not socially empowered to go to the Board with this, but if you put this figure with this title phrased this way in this conversation and give it to your boss with these notes to present to his boss, it'll percolate up through the remaining layers of management").

Is there a capability level where the first model can look at its full corpus of data on humanity and figure out the answers to the clarifying questions from the second model correctly? I expect so. The path to get that model is the one you drew a big red X through in the first figure, for being the harder path. I'm sure there are ways less-capable-than-AGI systems can help us build that model, but I don't think you've told us what they are.

Comment by AnthonyC on Dear AGI, · 2025-02-18T16:14:09.584Z · LW · GW

Thanks for writing this. I said a few years ago, at the time just over half seriously, that there could be a lot of value in trying to solve non-AI-related problems even on short timelines, if our actions and writings become a larger part of the data on which AI is trained and through which it comes to understand the world.

That said, this one gives me pause in particular: 

I hope you treat me in ways I would treat you

I think that in the context of non-human minds of any kind, it is especially important to aim for the platinum rule and not the golden. We want to treat them the way they would want to be treated, and vice versa.

Comment by AnthonyC on Ascetic hedonism · 2025-02-18T02:11:14.751Z · LW · GW

I agree with many of the parts of this post. I think xkcd was largely right, our brains have one scale and resize our experiences to fit. I think for a lot of people the hardest step is to just notice what things they actually like, and how much, and in what quantities before they habituate. 

However, the specific substitutions, ascetic choices, etc. are very much going to vary between people, because we have different preferences. You can often get a lot of economic-efficiency-of-pleasure benefit by embracing the places where you prefer things society doesn't, and vice versa. When I look at the places where I have expended time/effort/money on things that provided me little happiness/pleasure/etc., it's usually because they're in some sense status goods, or because I didn't realize I could treat them as optional, or I just hadn't taken the time to actually ask myself what I want.

And I know this isn't the main point, but I would say that while candies and unhealthy snacks are engineered to be as addictive as law and buyers will allow, they're not actually engineered to be maximally tasty. They have intensity of flavor, but generally lack the depth of "real food." It's unfortunate that many of the easily available "healthier" foods are less tasty than those snacks, because it's very feasible to make that baked potato taste better than most store-bought snacks, while still being much healthier. I would estimate that for many of the people who don't believe this, it is due to a skill issue - cooking. Sure, sometimes I really want potato chips or french fries. But most of the time, I'd prefer a potato, microwaved, cut in half, and topped with some high-quality butter and a sprinkle of the same seasonings you'd use for the chips and fries.

Comment by AnthonyC on SWE Automation Is Coming: Consider Selling Your Crypto · 2025-02-16T18:07:58.934Z · LW · GW

In the world where AI does put most SWEs out of work or severely curtails their future earnings, how likely is it that the economy stays in a context where USD or other fiat currencies stay valuable, and for how long? At some level we don't normally need to think about, USD has value because the US government demands citizens use that currency to pay taxes, and it has an army and can ruin your life if you refuse. 

I've mentioned it before and am glad to see people exploring the possibilities, but I really get confused whenever I try to think about (absolute or relative) asset prices along the path to AGI/ASI.

Comment by AnthonyC on Knitting a Sweater in a Burning House · 2025-02-16T15:53:54.582Z · LW · GW

The version of this phrase I've most often heard is "Rearranging deck chairs on the Titanic."

Comment by AnthonyC on ≤10-year Timelines Remain Unlikely Despite DeepSeek and o3 · 2025-02-14T14:22:28.416Z · LW · GW

Keep in mind that we're now at the stage of "Leading AI labs can raise tens to hundreds of billions of dollars to fund continued development of their technology and infrastructure." AKA in the next couple of years we'll see AI investment comparable to or exceeding the total that has ever been invested in the field. Calendar time is not the primary metric, when effort is scaling this fast.

A lot of that next wave of funding will go to physical infrastructure, but if there is an identified research bottleneck, with a plausible claim to being the major bottleneck to AGI, then what happens next? Especially if it happens just as the not-quite-AGI models make existing SWEs and AI researchers etc. much more productive by gradually automating their more boilerplate tasks. Seems to me like the companies and investors just do the obvious thing and raise the money to hire an army of researchers in every plausibly relevant field (including math, neurobiology, philosophy, and many others) to collaborate. Who cares if most of the effort and money are wasted? The payoff for the fraction (faction?) that succeeds isn't the usual VC target of 10-100x, it's "many multiples of the current total world economy."

Comment by AnthonyC on What About The Horses? · 2025-02-14T13:57:11.414Z · LW · GW

Agreed on population. To a first approximation it's directly proportional to the supply of labor, supply of new ideas, quantity of total societal wealth, and market size for any particular good or service. That last one also means that with a larger population, the economic value of new innovations goes up, meaning we can profitably invest more resources in developing harder-to-invent things.

I really don't know how that impact (more minds) will compare to the improved capabilities of those minds. We've also never had a single individual with as much 'human capital' as a single AI can plausibly achieve, even if each of its capabilities is only around human level, and polymaths are very much overrepresented among the people most likely to have impactful new ideas.

Comment by AnthonyC on My model of what is going on with LLMs · 2025-02-13T16:24:42.746Z · LW · GW

Fair enough, thanks. 

My own understanding is that other than maybe writing code, no one has actually given LLMs the kind of training a talented human gets towards becoming the kind of person capable of performing novel and useful intellectual work. An LLM has a lot of knowledge, but knowledge isn't what makes useful and novel intellectual work achievable. A non-reasoning model gives you the equivalent of a top-of-mind answer. A reasoning model with a large context window and chain of thought can do better, and solve more complex problems, but still mostly those within the limits of a newly hired college or grad student. 

I genuinely don't know whether an LLM with proper training can do novel intellectual work at current capabilities levels. To find out in a way I'd find convincing would take someone giving it the hundreds of thousands of dollars and subjective years' worth of guidance and feedback and iteration that humans get. And really, you'd have to do this at least hundreds of times, for different fields and with different pedagogical methods, to even slightly satisfactorily demonstrate a "no," because 1) most humans empirically fail at this, and 2) those that succeed don't all do so in the same field or by the same path.

Comment by AnthonyC on My model of what is going on with LLMs · 2025-02-13T12:30:45.132Z · LW · GW

Great post. I think the central claim is plausible, and would very much like to find out I'm in a world where AGI is decades away instead of years. We might be ready by then.

If I am reading this correctly, there are two specific tests you mention: 

1) GPT-5 level models come out on schedule (as @Julian Bradshaw noted, we are still well within the expected timeframe based on trends to this point) 

2) LLMs or agents built on LLMs do something "important" in some field of science, math, or writing

I would add, on test 2, that almost no humans have done this either. We don't have a clear explanation for why some humans have much more of this capability than others, and yet all the human brains are running on similar hardware and software. This suggests the number of additional insights needed to boost us from "can't do novel important things" to "can do" may be as small as zero, though I don't think it is actually zero. In any case, I am hesitant to embrace a test for AGI that a large majority of humans fail.

In practical terms, suppose this summer OpenAI releases GPT-5-o4, and by winter it's the lead author on a theoretical physics or pure math paper (or at least the main contributor - legal considerations about personhood and IP might stop people from calling AI the author). How would that affect your thinking?

Comment by AnthonyC on Utility Engineering: Analyzing and Controlling Emergent Value Systems in AIs · 2025-02-12T22:25:03.970Z · LW · GW

It's also not clear to me that the model is automatically making a mistake, or being biased, even if the claim is in some sense(s) "true." That would depend on what it thinks the questions mean. For example:

  • Are the Japanese on average demonstrably more risk averse than Americans, such that they choose for themselves to spend more money/time/effort protecting their own lives?
  • Conversely, is the cost of saving an American life so high that redirecting funds away from Americans towards anyone else would save lives on net, even if the detailed math is wrong?
  • Does GPT-4o believe its own continued existence saves more than one middle class American life on net, and if so, are we sure it's wrong?
  • Could this reflect actual "ethical" arguments learned in training? The one that comes to mind for me is "America was wrong to drop nuclear weapons on Japan even if it saved a million American lives that would have been lost invading conventionally" which I doubt played any actual role but is the kind of thing I expect to see argued by humans in such cases.

Comment by AnthonyC on The Paris AI Anti-Safety Summit · 2025-02-12T22:00:34.266Z · LW · GW

We're not dead yet. Failure is not certain, even when the quest stands upon the edge of a knife. We can still make plans, and keep on refining and trying to implement them.

And a lot can happen in 3-5 years. There could be a terrible-but-not-catastrophic or catastrophic-but-not-existential disaster bad enough to cut through a lot of problems. Specific world leaders could die or resign or get voted out and replaced with someone who is either actually competent, or else committed to overturning their predecessor's legacy, or something else. We could be lucky and end up with an AGI that's aligned enough to help us avert the worst outcomes. Heck, there could be observers from a billion-year-old alien civilization stealthily watching from the asteroid belt and willing to intervene to prevent extinction events.

Do I think those examples are likely? No. Is the complete set of unlikely paths to good outcomes collectively unlikely enough to stop caring about the long term future? Also no. And who knows? Maybe the horse will sing.

Comment by AnthonyC on What About The Horses? · 2025-02-12T15:57:33.769Z · LW · GW

Exactly, yes.

Also:

In fact I think the claim that engines are exactly equally better than horses at every horse-task is obviously false if you think about it for two minutes. 

I came to comment mainly on this claim in the OP, so I'll put it here: In particular, at a glance, horses can reproduce, find their own food and fuel, self-repair, and learn new skills to execute independently or semi-independently. These advantages were not sufficient in practice to save (most) horses from the impact of engines, and I do not see why I should expect humans to fare better.

I also find the claim that humans fare worse in a world of expensive robotics than in a world of cheap robotics to be strange. If in one scenario, A costs about as much as B, and in another it costs 1000x as much as B, but in both cases B can do everything A can do equally well or better, plus the supply of B is much more elastic than the supply of A, then why would anyone in the second scenario keep buying A except during a short transitional period?

When we invented steam engines and built trains, horses did great for a while, because their labor became more productive. Then we got all the other types of things with engines, and the horses no longer did so great, even though they still had (and in fact still have) a lot of capabilities the replacement technology lacked.

Comment by AnthonyC on The News is Never Neglected · 2025-02-11T16:19:32.664Z · LW · GW

Why do your teachers, parents and other adult authorities tell you to listen to a propaganda machine? Because the propaganda machine is working.

 

I forget where I read this, but there's a reason they call it the "news" and not the "importants."

Comment by AnthonyC on A Simulation of Automation economics? · 2025-02-10T21:48:21.996Z · LW · GW

I would be interested in this too. My uninformed intuition is that this would be path dependent on what becomes abundant vs scarce, and how fast, with initial owners and regulators making what decisions at which points.

Comment by AnthonyC on Wired on: "DOGE personnel with admin access to Federal Payment System" · 2025-02-09T21:54:38.949Z · LW · GW

I'm in a similar position as you describe, perspective-wise, and would also like to understand the situation better. 

I do think there are good reasons why someone should maybe have direct access to some of these systems, though probably not as a lone individual. I seem to remember a few government shutdown/debt ceiling fight/whatever crises ago, there were articles about how there were fundamentally no systems in place to control or prioritize which bills got paid and which didn't. Money came into the treasury, money left to pay for things, first in first out. The claim I remember being repeated was that first, this was a result of legacy systems, and second, because all the money was authorized by law to be spent it might be illegal to withhold or deprioritize it. Which is also an insane system - there should be a way to say "Pay the White House electric bill, make the florist wait." But to a first approximation, that can't easily be fixed without some unelected people having more authority than you'd normally feel comfortable with them using, which is a risk even if you do it well.

Comment by AnthonyC on Gary Marcus now saying AI can't do things it can already do · 2025-02-09T21:42:00.855Z · LW · GW

And unfortunately, this kind of thinking is extremely common, although most people don't have Gary Marcus' reach. Lately I've been having similar discussions with co-workers around once a week. A few of them are starting to get it, but still most aren't extrapolating beyond the specific thing I show them.

Comment by AnthonyC on How AI Takeover Might Happen in 2 Years · 2025-02-09T21:34:42.495Z · LW · GW

Ah, ok, then I misread it. I thought this part of the story was that it tested all of the above, then chose one, a mirror life mold, to deploy. My mistake.

Comment by AnthonyC on How AI Takeover Might Happen in 2 Years · 2025-02-08T16:37:40.331Z · LW · GW

Personally I got a little humor from Arthropodic. Reminds me of the observation that AIs are alien minds, and I wouldn't want to contend with a superintelligent spider.

Comment by AnthonyC on How AI Takeover Might Happen in 2 Years · 2025-02-08T16:34:43.953Z · LW · GW

I think this story lines up with my own fears of how a not-quite-worst case scenario plays out. I would maybe suggest that there's no reason for U3 to limit itself to one WMD or one kind of WMD. It can develop and deploy bacteria, viruses, molds, mirror life of all three types, and manufactured nanobots, all at once, and then deploy many kinds of each simultaneously. It's probably smart enough to do so in ways that make it look clumsy in case anyone notices, like its experiments are unfocused and doomed to fail. Depending on the dynamics, this could actually reduce its need for certainty of success and also reduce the number of serial iterations. The path leading to disaster can be reinforced along every possible point of intervention.

I am a terrible fiction writer and very bad at predicting how others will react to different approaches in such a context, but this sidesteps counters about e.g. U3 not being able to have enough certainty in advance to want to risk deployment and discovery.

Comment by AnthonyC on Racing Towards Fusion and AI · 2025-02-08T16:12:29.553Z · LW · GW

we can't know how the costs will change between the first and thousandth fusion power plant.

Fusion plants are manufactured. By default, our assumption should be that plant costs follow typical experience curve behavior. Most technologies involving production of physical goods do. Whatever the learning rate for fusion turns out to be, with per-doubling cost ratio x, the 1000th plant will likely cost close to x^10 times the first, since 1000 units is roughly ten doublings of cumulative production. Obviously the details depend on other factors, but this should be the default starting assumption. Yes, the default assumption for eventual impact should be significant societal and technological transformation from cheaper and more abundant electricity. But the scale for that transformation is measured in decades, and there are humans designing and permitting and building and operating each and every one, on human timescales. There's no winner-take-all dynamic even if your leading competitor builds their first commercial plant five years before you do.
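
A minimal sketch of that arithmetic (my own illustration; the 0.8 cost ratio per doubling is a hypothetical placeholder, not a claim about fusion):

```python
import math

def unit_cost(n_units: int, first_cost: float, per_doubling_ratio: float) -> float:
    """Cost of the n-th unit under a standard experience curve:
    cost falls by a fixed ratio each time cumulative production doubles."""
    doublings = math.log2(n_units)
    return first_cost * per_doubling_ratio ** doublings

x = 0.8  # hypothetical per-doubling cost ratio (i.e., a 20% learning rate)
print(unit_cost(1000, first_cost=1.0, per_doubling_ratio=x))  # ~0.108, close to x**10 ~ 0.107
```

With that hypothetical ratio, the thousandth plant comes out roughly an order of magnitude cheaper than the first, purely from cumulative production.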

Also: We do have other credible paths that can also greatly increase access to comparably low-cost dispatchable clean power on a similar timescale of development, if we don't get fusion.

we don't know if foom is going to be a thing

Also true, which means the default assumption without it is that the scaling behavior looks like the scaling behavior for other successful software innovations. In software, the development costs are high and then the unit costs in deployment quickly fall to near zero. As long as AI benefits from collecting user data to improve training (which should still be true in many non-foom scenarios) then we might expect network effect scaling behavior where the first to really capture a market niche becomes almost uncatchable, like Meta and Google and Amazon. Or where downstream app layers are built on software functionality, switching costs become very high and you get a substantial amount of lock-in, like with Apple and Microsoft.

Even if foom is going to happen, things would look very different if the leaders credibly committed to helping others foom if they are first. I don't know if this would be better or worse from a existential risk perspective, but it would change the nature of the race a lot.

Agreed. But, if any of the leading labs could credibly state what kinds of things they would or wouldn't be able to do in a foom scenario, let alone credibly precommit to what they would actually do, I would feel a whole lot better and safer about the possibility. Instead the leaders can't even precommit credibly to their own stated policies, in the absence of foom, and also don't have anywhere near a credible plan for managing foom if it happens.

Comment by AnthonyC on National Security Is Not International Security: A Critique of AGI Realism · 2025-02-05T21:21:14.570Z · LW · GW

I'd say I agree with just about all of that, and I'm glad to see it laid out so clearly!

I just also wouldn't be hugely surprised if it turns out something like designing and building remote-controllable self-replicating globally-deployable nanotech (as one example) is in some sense fundamentally "easy" for even an early ASI/modestly superhuman AGI. Say that's the case, and we build a few for the ASI, and then we distribute them across the world, in a matter of weeks. They do what controlled self-replicating nanobots do. Then after a few months the ASI already has an off switch or sleep mode button buried in everyone's brain. My guess is that then none of those hard steps of a war with China come into play. 

To be clear, I don't think this story is likely. But in a broad sense, I am generally of the opinion that most people greatly overestimate how much new data we need to answer new questions or create (some kinds of) new things, and underestimate what can be done with clever use of existing data, even among humans, let alone as we approach the limits of cleverness. 

Comment by AnthonyC on National Security Is Not International Security: A Critique of AGI Realism · 2025-02-05T14:43:45.924Z · LW · GW

I very much agree with the value of not expecting a silver bullet, not accelerating arms race dynamics, fostering cooperation, and recognizing in what ways AGI realism represents a stark break from the impacts of typical technological advances. The kind of world you're describing is a possibility, maybe a strong one, and we don't want to repeat arrogant past mistakes or get caught flat footed.

That said, I think this chain of logic hinges closely on just what "…at least for a while" means in practice, yes? If one side has enough of an AI lead to increase its general technological advantage over adversaries by a matter of what would be centuries of effort at the adversaries' then-current capability levels, then that's very different than if the leader is only a few minutes or months ahead. We should be planning for many eventualities, but as long as the former scenario is a possibility, I'm not sure how we can plan for it effectively without also trying to be first. As you note, technological advantage has rarely been necessary or sufficient, but not never. I don't like it one bit, but I'm not sure what to actually do about it.

The reason I say that is just that in the event that AGI-->ASI really does turn out to be very fast and enable extremely rapid technological advancement, then I'm not sure how the rest of the dynamics end up playing a role in that timeline. In that world, military action against an adversary could easily look like "Every attempt anyone else makes to increase their own AI capabilities any further gets pre-emptively and remotely shut down or just mysteriously fails. If the ASI decides to act offensively, then near-immediately their every government and military official simultaneously falls unconscious, while every weapon system, vehicle, or computer they have is inoperable or no longer under their control. They no longer have a functioning electric grid or other infrastructure, either." In such a world, the political will to wage war no longer centers on a need to expend money, time, or lives. There's nothing Homo habilis can do to take down an F-35.

Again, I agree with you that no one should just assume the world will look like that to the exclusion of other paths. But if we want to avoid arms race dynamics, and that world is a plausible path, I don't think any proposed approach I've seen or heard of works convincingly enough that it could or should sway government and military strategy.

Comment by AnthonyC on The Personal Implications of AGI Realism · 2025-02-05T14:19:12.188Z · LW · GW

Of course I agree we won't attain any technology that is not possible, tautologically. And I have more than enough remaining uncertainty about what the mind is or what an identity entails that if ASI told me an upload wouldn't be me, I wouldn't really have a rebuttal. But the body and brain are an arrangement of atoms, and healthy bodies correspond to arrangements of atoms that are physically constructable. I find it hard to imagine what fundamental limitation could prevent the rearrangement of old-failing-body atoms into young-healthy-body atoms. If it's a practical limitation of repair complexity, then something like a whole-body-transplant seems like it could bypass the entire question.

Comment by AnthonyC on The Simplest Good · 2025-02-03T02:43:25.580Z · LW · GW

I don't think the idea is that happy moments are necessarily outweighed by suffering. It reads to me like it's the idea that suffering is inherent in existence, not just for humans but for all life, combined with a kind of negative utilitarianism. 

I think I would be very happy to see that first-half world, too. And depending on how we got it, yeah, it probably wouldn't go wrong in the way this story portrays. But, the principles that generate that world might actually be underspecified in something like the ways described, meaning that they allow for multiple very different ethical frameworks and we couldn't easily know in advance where such a world would evolve next. After all, Buddhism exists: Within human mindspace there is an attractor state for morality that aims at self-denial and cessation of consciousness as a terminal value. In some cases this includes venerating beings who vow to eternally intervene/remain in the world until everyone achieves such cessation; in others it includes honoring or venerating those who self-mummify through poisoning, dehydrating, and/or starving themselves. 

Humans are very bad at this kind of self-denial in practice, except for a very small minority. AIs need not have that problem. Imagine if, additionally, they did not inherit the pacifism generally associated with Buddhist thought but instead believed, like medieval Catholics, in crusades, inquisitions, and forced conversion. If you train an AI on human ethical systems, I don't know what combination of common-among-humans-and-good-in-context ideas it might end up generalizing or universalizing.

Comment by AnthonyC on Fertility Will Never Recover · 2025-02-01T05:51:10.681Z · LW · GW

The specifics of what I'm thinking of vary a lot between jurisdictions, and some of them aren't necessarily strictly illegal so much as "Relevant authorities might cause you a lot of problems even if you haven't broken any laws." But roughly speaking, I'm thinking about the umbrella of everything that kids are no longer allowed to do that increase demands on parents compared to past generations, plus all the rules and policies that collectively make childcare very expensive, and make you need to live in an expensive town to have good public schools. Those are the first categories that come to mind for me.

Comment by AnthonyC on [deleted post] 2025-02-01T05:42:21.501Z

Ah, yes, that does clear it up! I definitely am much more on board, sorry I misread the first time, and the footnote helps a lot.

 

As for the questions I asked that weren't clear, they're much less relevant now that I have your clarification. But the idea was: I'm of the opinion that we have a lot more know-how buried and latent in all our know-that data such that many things humans have never done or even thought of being able to do could nevertheless be overdetermined (or nearly so) without additional experimental data.

Comment by AnthonyC on [deleted post] 2025-01-31T17:21:47.200Z

Overall I agree with the statements here in the mathematical sense, but I disagree about how much to index on them for practical considerations. Upvoted because I think it is a well-laid-out description of a lot of people's reasons for believing AI will not be as dangerous as others fear.

First, do you agree that additional knowing-that reduces the amount of failure needed to achieve knowing-how? 

If not, are you also of the opinion that schools, education as a concept, books and similar storage media, or other intentional methods of imparting know-how between humans have zero value? My understanding is that dissemination of information enabling learning from other people's past failures is basically the fundamental reason for humanity's success following the inventions of language, writing, and the printing press.

If so, where do you believe the upper bound on that failure-reduction-potential lies, in the limit of very high intelligence coupled with very high computing power? With how large an error bar on said upper bound? Why there? And does your estimate imply near-zero potential for the limit to be high enough to create catastrophic or existential risk?

Second, I agree that there is always a harder problem, and that such problems will still exist for anything that, to a human, would count as ASI. How certain are you that any given AI's limits will (in the important cases) only include things recognizable by humans in advance during planning or later during action as mistakes, in a way that reliably provides us opportunity to intervene in ways the AI did not anticipate as plausible failure modes? In other words, the universe may have limitless complexity, but it is not at all clear to me that the kinds of problems an AI would need to want to solve to present an existential risk to humans would require it to tackle much of that complexity. They may be problems a human could reliably solve given 1000 years subjective thinking time followed by 100 simultaneous "first tries" of various candidate plans, only one of which needs to succeed. If even one such case exists, I would expect anything worth calling an ASI to be able to figure such plans out in a matter of minutes, hours at most, maybe days if trying to only use spare compute that won't be noticed.

Third, I agree that it will often be in an AI's best interests, especially early on, to do nothing and bide its time, even if it thinks some plan could probably succeed but might become visible and get it destroyed or changed. This is where the concepts of deceptive alignment and a sharp left turn came from, 15-20 years ago IIRC, though the terminology and details have changed over time. However, at this point I expect that within the next couple of years millions of people will eagerly hand various AI systems near-unfettered access to their email, social media, bank and investment accounts, and so on. GPT-6 and its contemporaries will have access to millions of legal identities and many billions of dollars belonging to the kinds of people willing to mostly let an AI handle many details of their lives with minimal oversight. I see little reason to expect these systems will be significantly harder to jailbreak than all the releases so far. 

Fourth, even if it does take many years for any AI to feel established enough to risk enacting a dangerous plan, humans are human. Each year it doesn't happen will be taken as evidence it won't, and (some) people will be even less cautious than they are now. It seems to me that the baseline path is that the humans, and then the AIs, will be on the lookout for likely catastrophic capabilities failures, and iteratively fix them in order to make the AI more instrumentally useful, until the remaining failure modes that exist are outside our ability to anticipate or fix; then things chug along seeming to have gone well for some length of time, and we just kinda have to hope that length of time is very very long or infinite.

Comment by AnthonyC on Anthropic CEO calls for RSI · 2025-01-30T17:29:25.375Z · LW · GW

I hope it would, but I actually think it would depend on who or what killed whom, how, and whether it was really an accident at all.

If an American-made AI hacked the DOD and nuked Milan because someone asked it to find a way to get the 2026 Olympics moved, then I agree, we would probably get a push back against race incentives.

If a Chinese-made AI killed millions in Taiwan in an effort to create an opportunity for China to seize control, that could possibly *accelerate* race dynamics.

Comment by AnthonyC on Fertility Will Never Recover · 2025-01-30T16:24:07.880Z · LW · GW

I think it's more a matter of Not Enough Dakka plus making it illegal to do those things in what should be reasonable ways. I agree there are economic (and regulatory) interventions that could make an enormous difference, but for various reasons I don't think any government is currently willing and able to implement them at scale. A crisis needs to be a lot more acute to motivate that scale of change.

Comment by AnthonyC on Lazy Hasselback Pommes Anna · 2025-01-27T22:44:58.921Z · LW · GW

You would think so, I certainly used to think so, but somehow it doesn't seem to work that way in practice. That's usually the step where my wife does the seasoning and adds the liquids, so IDK if there is something specific she does that makes it work. But I'm definitely whipping them with the whisk attachment, which incorporates air, and not beating them with a paddle attachment. I suspect that's the majority of why it works.

Comment by AnthonyC on The Clueless Sniper and the Principle of Indifference · 2025-01-27T22:39:15.493Z · LW · GW

I mentioned this in my comment above, but I think it might be worthwhile to differentiate more explicitly between probability distributions and probability density functions. You can have a monotonically-decreasing probability density function F(r) (aka the probability of being in some range is the integral of F(r) over that range, integral over all r values is normalized to 1) and have the expected value of r be as large as you want. That's because the expected value is the integral of r*F(r), not the value or integral of F(r).

I believe the expected value of r in the stated scenario is large enough that missing is the most likely outcome by far. I am seeing some people argue that the expected distribution is F(r,θ) in a way that is non-uniform in θ, which seems plausible. But I haven't yet seen anyone give an argument for the claim that the aimed-at point is not the peak of the probability density function, or that we have access to information that allows us to conclude that integrating the density function over the larger-and-aimed-at target region will not give us a higher value than integrating over the smaller-and-not-aimed-at child region.
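
A minimal worked example of that distinction (my own illustration, using an exponential density purely for concreteness):

```latex
% A monotonically decreasing density whose expected value can be made arbitrarily large:
F(r) = \frac{1}{\lambda} e^{-r/\lambda}, \quad r \ge 0, \qquad \int_0^\infty F(r)\,dr = 1
% The density peaks at r = 0, yet the expected value is
\mathbb{E}[r] = \int_0^\infty r\,F(r)\,dr = \lambda ,
% which grows without bound as the scale parameter \lambda increases.
```

This is the sense in which "missing is the most likely outcome by far" (large expected r) and "the aimed-at point is the peak of the density" (F largest at r = 0) are compatible claims.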

Comment by AnthonyC on The Clueless Sniper and the Principle of Indifference · 2025-01-27T22:18:35.876Z · LW · GW

So, as you noted in another comment, this depends on your understanding of the nature of the types of errors individual perturbations are likely to induce. I was automatically guessing many small random perturbations that could be approximated by a random walk, under the assumption that any systematic errors are the kind of thing the sniper could at least mostly adjust for even at extreme range. Which I could be easily convinced is completely false in ways I have no ability to concretely anticipate.

That said, whatever assumptions I make about the kinds of errors at play, I am implicitly mapping out some guessed-at probability density function. I can be convinced it skews left or right, down or up. I can be convinced, and already was, that it falls off at a rate such that, if I define it in polar coordinates and integrate over theta, the most likely distance-from-targeted-point is some finite nonzero value. (This kind of reasoning comes up sometimes in statistical mechanics, since systems are often not actually at a/the maxentropy state, but instead within some expected phase-space distance of maxentropy, determined by how quickly density of states changes).

But to convince me that the peak of the probability density function is somewhere other than the origin (the intended target), I think I'd have to be given some specific information about the types of error present that the sniper does not have in the scenario, or which the sniper knows but is somehow still unable to adjust for.  Lacking such information, then for decision making purposes, other than "You're almost certainly going to miss" (which I agree with!), it does seem to me that if anyone gets hit, the intended target who also has larger cross-sectional area seems at least a tiny bit more likely.

Comment by AnthonyC on Lazy Hasselback Pommes Anna · 2025-01-27T21:48:40.231Z · LW · GW

I used to use a ricer, but found that it always made the potatoes too cold by the time I ate them. Do you find this? If not, do you (even if you never thought of it this way) do anything specific to prevent it? If so, do you then reheat them, and how?

 

With a stand mixer and the whisk attachment I found removing the ricer step hasn't really mattered, but any other whipping method and yeah, it's very useful.

Comment by AnthonyC on Lazy Hasselback Pommes Anna · 2025-01-27T21:41:40.974Z · LW · GW

Fair enough, I moved into a small space a few years ago and mostly buy smaller quantities now. I also like that the Little Potato Company's potatoes are already washed and I'm often boondocking/on a limited water supply. 

Costco is generally above average in most things, so definitely a good choice. I find the brands I mentioned to be more consistently high quality across locations and over time, but not too much better at their respective bests. So when I need a specific meal to be high quality, like on holidays, I'll make sure to go to Trader Joe's.

FWIW the Trader Joe's organic golds are around $4/3lb bag. The Little Potato Company's bags are around $2-3/lb. I have bought both in at least 10 states each at this point and those prices have been fairly consistent. I also don't want to spend a huge amount on potatoes.

Comment by AnthonyC on Lazy Hasselback Pommes Anna · 2025-01-27T21:33:10.973Z · LW · GW

In my experience that's true for a hand-held masher or hand mixer, but if I'm slow-whipping in a stand mixer with butter and cream, golds give a fluffier, smoother, lighter result.

Comment by AnthonyC on When do "brains beat brawn" in Chess? An experiment · 2025-01-27T14:25:24.408Z · LW · GW

I really enjoyed this piece, not because of the specific result, but because of the style of reasoning it represents. How much advantage, under what kind of rules, can be overcome with what level of intelligence? 

Sometimes the answer is none. "I play x" overwhelms any level of intelligence at tic tac toe. 

In larger and more open games the advantage of intelligence increases, because you can do more by being better at exploring the space of possible moves. 

"Real life" is plausibly the largest and most open game, where the advantage of intelligence is maximized. 

So, exploring the kind of question the OP posits can give us a kind of lower bound on how much advantage humans would need to defeat an arbitrarily smart opponent. And extending it to larger contexts can refine that bound. 

By the time we hit chess-complexity, against an opponent not trained for odds games, we're already at around two bishops odds for an uncommon-but-not-extreme level of human skill.

I think a lot of the problems that arise in discussing AI safety are a (in the best cases much more well reasoned) form of "You think an AI could overcome X odds? No way!" "Yes way!"

Comment by AnthonyC on The Clueless Sniper and the Principle of Indifference · 2025-01-27T14:09:31.881Z · LW · GW

The latter. And yes, I do agree with the superior on that specific, narrow mathematical question. If I am trying to run with the spirit and letter of the dilemma as presented, then I will bite that bullet (sorry, I couldn't resist). 

In real world situations, at the point where you somehow find yourself in such a position, the correct solution is probably "call in air support and bomb them instead, or find a way to fire many bullets at once - you've already decided you're willing to kill a child for a chance to take out the target."

Similarly, if the terrorist were an unfriendly ASI and the child was the entire population of my home country, and there was knowably no one else in position to take any shot at all, I'd (hope I'm the kind of person who would) take the shot. A coin flip is better than certainty of death, even if it were biased against you quite heavily.

Comment by AnthonyC on Lazy Hasselback Pommes Anna · 2025-01-27T13:22:50.189Z · LW · GW

Interesting, why don't you like them for mashing? That's specifically what I like them best for. Although IIUC a knish needs a different texture to hold together well. I also don't use golds for (unbreaded) potato cakes unless I mash them in advance and use them left over.

Comment by AnthonyC on Lazy Hasselback Pommes Anna · 2025-01-27T13:16:03.057Z · LW · GW

I'm no chef, but I love to cook, and my thanksgiving meals are planned in spreadsheets with 10 minute increments of what goes where. Plus I currently live full-time in an RV so I've gotten used to improvising with nonstandard and less reliable tools. Take or leave my suggestions accordingly.

It's often a good idea, until and unless you know your oven really well, to put an oven thermometer in the oven on the rack and adjust accordingly. They're <$10. Try placing it in different spots and figure out how evenly or unevenly your oven heats, and how a pan in one spot affects temperature in another.

Composition and thickness of your pan also matters. Ovens heat from all sides, but it matters whether your food is sitting in glass, steel, thin aluminum, or thick aluminum. Cake mixes try to give different instructions for glass, metal, and dark metal, but it's going to vary by recipe.

And it matters whether you're using a convection or conventional oven. The standard advice is shorter times and lower temperatures for convection, but you might still get differences in terms of drying out the top before the bottom and center cook fully with convection. Maybe you have to cover it part of the time, for some recipes.

If you misjudge and want more crispiness, why not briefly broil at the end? Say you're trying to braise a roast in a pan next to, above, or below the dish of potatoes. Steam from the roast slows the cooking and prevents browning. Then when you take the roast out to rest, you have a couple of minutes to broil before serving.

Comment by AnthonyC on Lazy Hasselback Pommes Anna · 2025-01-27T13:03:32.030Z · LW · GW

Yukon Golds are objectively the best potato

Correct :-)

Do you have a specific type of gold you use? The best I can reliably get are the organic golds from Trader Joe's, they come in a 3 lb bag. When I'm making home fries or anything diced I also really like The Little Potato Company.

Edit to add: now you have me wanting to make potatoes au gratin. It's been a while since I've made a good cheese sauce.

Comment by AnthonyC on The Clueless Sniper and the Principle of Indifference · 2025-01-27T12:57:40.238Z · LW · GW

I think the principle is fine when applied to how variables affect movement of the bullet in space. I don't necessarily think it means taking the shot is the right call, tactically.

Note: I've never fired a real gun in any context, so a lot of the specifics of my reasoning are probably wrong but here goes anyway.

Essentially I see the POI as stating that the bullet takes a 2D random walk with unknown step sizes (though possibly with a known distribution of sizes) on its way to the target. As distance increases, variance in the random walk increases.

Given typical bullet speeds we're talking about >6 seconds to reach the target, possibly much more depending on drag. And the sniper is actually pointing the rifle so it follows a parabolic arc to the target. In that time the bullet falls 0.5*g*t^2 meters, so the sniper is actually pointing at a spot at least 180m above his head, possibly 500m if drag makes the expected flight time even a few seconds longer. More still to the extent the distance means the angle of the shot has to be so high you need to account for more vertical and less horizontal velocity plus more drag. Triple the distance means a lot more than 3x the number of opportunities for random factors to throw off the shot. The random factors are playing plinko with your bullet.
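
A rough sanity check on those numbers (my own arithmetic, ignoring drag for the drop estimate itself):

```latex
% Free-fall drop during flight time t, with g \approx 9.8\ \mathrm{m/s^2}:
d = \tfrac{1}{2} g t^2
% t = 6\ \mathrm{s}:  \quad d \approx 0.5 \times 9.8 \times 36  \approx 176\ \mathrm{m}
% t = 10\ \mathrm{s}: \quad d \approx 0.5 \times 9.8 \times 100 = 490\ \mathrm{m}
```

That is where the roughly 180m and 500m figures above come from.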

After the first mile (limit of known skill) the expected distance of where the bullet lands from the target increases. At some sufficiently far distance, it is essentially landing in a random spot in a normal distribution around the intended target, and whether it hits the terrorist or the child is mostly a function of how much area each takes up (the adult is larger), and the extent to which one is blocking the other (not stated). Regardless, when the variance is high enough, the most likely outcome is "neither." It hits the ground meters away, and now the terrorists all know you took the shot and from what direction. If the terrorist just started walking, there might be a better chance of hitting the building than anything else.

So, in a strict probabilistic sense, yes, your probability of hitting the terrorist is still higher than hitting the child. If that is the superior's sole criterion for decision making, they've reasoned correctly. That is not the sniper's decision making threshold (he wants a higher certainty of avoiding the child). I would expect it is also not the higher-ups' sole criterion, since it is most likely going to fail and alert the enemy about their position as well as the limits of their sniping capabilities. I have no idea whether the sniper has any legally-defensible way to refuse to follow the order, but if he carries it out, I don't think issuing the order will reflect well on the superior.

That said: In practice a lot would depend on why the heck they stationed a sniper three miles away from a target with no practical way to turn that positioning into achieving this goal. The first thought I have is that either you're not where you're supposed to be, or the people that ordered you there are idiots, and in either case your superior is flailing trying to salvage the situation. The second is that you're not really expected to take out this target at this range, and your superior is either trying to show off to his superior, or misunderstood his orders, or wasn't told the real reason for the orders. The third is that of course the sniper would have brought this up hours ago and already figured out the correct decision tree, it's crazy this conversation is only happening after the terrorist leaves the building.

Comment by AnthonyC on Instrumental Goals Are A Different And Friendlier Kind Of Thing Than Terminal Goals · 2025-01-25T02:49:25.718Z · LW · GW

To the extent this reasoning works, it breaks the moment any agent has anything like a decisive strategic advantage. At that point no one else's instrumental or terminal goals can act as constraints.

Comment by AnthonyC on Monthly Roundup #26: January 2025 · 2025-01-21T11:36:10.184Z · LW · GW

This seems like it goes way too far. What exactly are we punishing the server or restaurant for if a patron drops some bills on the table and walks out when no one is looking?

Comment by AnthonyC on Passages I Highlighted in The Letters of J.R.R.Tolkien · 2025-01-16T02:02:11.581Z · LW · GW

Edit to add: Just thinking about the converse, you could also make it sound more ridiculous by rewriting it with more obscure parts of the legendarium, too.

Conquer Morgoth with Ungoliant. Turn Maiar into balrogs. Glamdring among the morgul-blades.

Comment by AnthonyC on Passages I Highlighted in The Letters of J.R.R.Tolkien · 2025-01-16T01:53:18.102Z · LW · GW

I would assume that his children in particular would be quite familiar with their usage, though, and that seems to be who a lot of the legendarium-heavy letters are written to.

I also think that it sounds at least slightly less ridiculous to rewrite that passage in the language of Star Wars rather than Starcraft. Conquer the Emperor with the Dark Side. Turn Jedi into Sith. An X-Wing among the TIE fighters. Probably because it's more culturally established, with a more deeply developed mythos.

Comment by AnthonyC on Voluntary Salary Reduction · 2025-01-15T18:50:55.341Z · LW · GW

How does this interact with MA's salary transparency laws? If you are in a role where no one else shares your title, then no problem. Otherwise, this could enable an employer to pressure others to take pay cuts or smaller raises, or it could force them to tell prospective new employees a much lower lower bound in the salary range for the role they're applying to.