Posts

Release: Optimal Weave (P1): A Prototype Cohabitive Game 2024-08-17T14:08:18.947Z
I didn't think I'd take the time to build this calibration training game, but with websim it took roughly 30 seconds, so here it is! 2024-08-02T22:35:21.136Z
Offering service as a sensayer for simulationist-adjacent beliefs. 2024-05-22T18:52:05.576Z
[Cosmology Talks] New Probability Axioms Could Fix Cosmology's Multiverse (Partially) - Sylvia Wenmackers 2024-04-14T01:26:38.515Z
All About Concave and Convex Agents 2024-03-24T21:37:17.922Z
Do not delete your misaligned AGI. 2024-03-24T21:37:07.724Z
Elon files grave charges against OpenAI 2024-03-01T17:42:13.963Z
Verifiable private execution of machine learning models with Risc0? 2023-10-25T00:44:48.643Z
Eleuther releases Llemma: An Open Language Model For Mathematics 2023-10-17T20:03:45.419Z
A thought about the constraints of debtlessness in online communities 2023-10-07T21:26:44.480Z
The point of a game is not to win, and you shouldn't even pretend that it is 2023-09-28T15:54:27.990Z
Cohabitive Games so Far 2023-09-28T15:41:27.986Z
Do agents with (mutually known) identical utility functions but irreconcilable knowledge sometimes fight? 2023-08-23T08:13:05.631Z
Apparently, of the 195 Million the DoD allocated in University Research Funding Awards in 2022, more than half of them concerned AI or compute hardware research 2023-07-07T01:20:20.079Z
Using Claude to convert dialog transcripts into great posts? 2023-06-21T20:19:44.403Z
The Gom Jabbar scene from Dune is essentially a short film about what Rationality is for 2023-03-22T08:33:38.321Z
Will chat logs and other records of our lives be maintained indefinitely by the advertising industry? 2022-11-29T00:30:46.415Z
[Video] How having Fast Fourier Transforms sooner could have helped with Nuclear Disarmament - Veritaserum 2022-11-03T21:04:35.839Z
The Mirror Chamber: A short story exploring the anthropic measure function and why it can matter 2022-11-03T06:47:56.376Z
I just watched the Open C3 Subcommittee Hearing on Unidentified Aerial Phenomena (UFOs). Here's a succinct summary and commentary + some background 2022-05-18T04:15:11.681Z
Alex Tabarrok advocates for crowdfunding systems with *Refund Bonuses*. I think this might be a natural occurrence of a money pump against Causal Decision Theory pledgers 2022-03-14T07:27:06.955Z
Grabby Aliens could be Good, could be Bad 2022-03-07T01:24:43.769Z
Would (myopic) general public good producers significantly accelerate the development of AGI? 2022-03-02T23:47:09.322Z
Are our community grouphouses typically rented, or owned? 2022-03-02T03:36:58.251Z
We need a theory of anthropic measure binding 2021-12-30T07:22:34.288Z
Venture Granters, The VCs of public goods, incentivizing good dreams 2021-12-17T08:57:30.858Z
Is progress in ML-assisted theorem-proving beneficial? 2021-09-28T01:54:37.820Z
Auckland, New Zealand – ACX Meetups Everywhere 2021 2021-08-23T08:49:53.187Z
Violent Unraveling: Suicidal Majoritarianism 2021-07-29T09:29:05.182Z
We should probably buy ADA? 2021-05-24T23:58:05.395Z
Deepmind has made a general inductor ("Making sense of sensory input") 2021-02-02T02:54:26.404Z
In software engineering, what are the upper limits of Language-Based Security? 2020-12-27T05:50:46.772Z
The Fermi Paradox has not been dissolved - James Fodor 2020-12-12T23:18:32.081Z
Propinquity Cities So Far 2020-11-16T23:12:52.065Z
Shouldn't there be a Chinese translation of Human Compatible? 2020-10-09T08:47:55.760Z
Should some variant of longtermism identify as a religion? 2020-09-11T05:02:43.740Z
Design thoughts for building a better kind of social space with many webs of trust 2020-09-06T02:08:54.766Z
Investment is a useful societal mechanism for getting new things made. Stock trading shares some functionality with investment, but seems very very inefficient, at that? 2020-08-24T01:18:19.808Z
misc raw responses to a tract of Critical Rationalism 2020-08-14T11:53:10.634Z
A speculative incentive design: self-determined price commitments as a way of averting monopoly 2020-04-28T07:44:52.440Z
MakoYass's Shortform 2020-04-19T00:12:46.448Z
Being right isn't enough. Confidence is very important. 2020-04-07T01:10:52.517Z
Thoughts about Dr Stone and Mythology 2020-02-25T01:51:29.519Z
When would an agent do something different as a result of believing the many worlds theory? 2019-12-15T01:02:40.952Z
What do the Charter Cities Institute likely mean when they refer to long term problems with the use of eminent domain? 2019-12-08T00:53:44.933Z
Mako's Notes from Skeptoid's 13 Hour 13th Birthday Stream 2019-10-06T09:43:32.464Z
The Transparent Society: A radical transformation that we should probably undergo 2019-09-03T02:27:21.498Z
Lana Wachowski is doing a new Matrix movie 2019-08-21T00:47:40.521Z
Prokaryote Multiverse. An argument that potential simulators do not have significantly more complex physics than ours 2019-08-18T04:22:53.879Z
Can we really prevent all warming for less than 10B$ with the mostly side-effect free geoengineering technique of Marine Cloud Brightening? 2019-08-05T00:12:14.630Z

Comments

Comment by mako yass (MakoYass) on Why Should I Assume CCP AGI is Worse Than USG AGI? · 2025-04-22T00:21:09.182Z · LW · GW

I don't see a way Stabilization of class and UBI could both happen. The reason wealth tends to entrench itself under current conditions is tied inherently to reinvestment and rent-seeking, which are destabilizing to the point where a stabilization would have to bring them to a halt. If you do that, UBI means redistribution. Redistribution without economic war inevitably settles towards equality, but also... the idea of money is kind of meaningless in that world, not just because economic conflict is a highly threatening form of instability, but also imo because financial technology will have progressed to the point where I don't think we'll have currencies with universally agreed values to redistribute.

What I'm getting at is that the whole class war framing can't be straightforwardly extrapolated into that world, and I haven't seen anyone doing that. Capitalist thinking about post-singularity economics is seemingly universally "I don't want to think about that right now, let's leave such ideas to the utopian hippies".

Comment by mako yass (MakoYass) on Why Should I Assume CCP AGI is Worse Than USG AGI? · 2025-04-21T21:07:18.702Z · LW · GW

2: I think you're probably wrong about the political reality of the groups in question. Not sharing AGI with the public is a bright line. For most of the leading players, crossing it would require building a group of AI researchers within the company who are all implausibly willing to cross a line that says "this is straight up horrible, evil, illegal, and dangerous for you personally", while still being capable enough to lead the race. It would also require implausible levels of mutual trust: that no one would try to cut others out of the deal at the last second (despite the fact that the group's purpose is cutting most of humanity out of the deal), and that no one would back out and whistleblow. And it requires an implausible level of secrecy to make sure state actors won't find out.

It would require a probably actually impossible cultural discontinuity and organization structure.

It's more conceivable to me that a lone CEO might try to do it via a backdoor. Something that mostly wasn't built on purpose, and that no one else in the company is aware could or would be used that way. But as soon as the conspiracy consists of more than one person...

Comment by mako yass (MakoYass) on Why Should I Assume CCP AGI is Worse Than USG AGI? · 2025-04-21T20:11:04.725Z · LW · GW

1: The best approach to aggregating preferences doesn't involve voting systems.

You could regard carefully controlling the expression of one's utility function as being like a vote, and so subject to the blight of strategic voting. In general, people have an incentive to understate their preferences about scenarios they consider unlikely (and vice versa), which influences the probability of those outcomes in unpredictable ways and fouls their strategy, or to understate valuations when buying and overstate them when selling. This may add up to a game that cannot be played well, a coordination problem, outcomes no one wanted.

But I don't think humans are all that guileful about how they express their utility function. Most of them have never actually expressed a utility function before; it's not easy to do, and it's not like checking a box on a list of 20 names. People know it's a game that can barely be played even in ordinary friendships. People don't know how to lie strategically about their preferences to the YouTube recommender system, let alone their neural lace.

Comment by mako yass (MakoYass) on Why Should I Assume CCP AGI is Worse Than USG AGI? · 2025-04-20T22:40:06.360Z · LW · GW

I think it's pretty straightforward to define what it would mean to align AGI with what democracy is actually supposed to be (the aggregate of preferences of the subjects, with an equal weighting for all), but hard to align it with the incredibly flawed American implementation of democracy, if that's what you mean?

The American system cannot be said to represent democracy well. It's intensely majoritarian at best, feudal at worst (since the parties stopped having primaries), indirect and so prone to regulatory capture, inefficient and opaque. I really hope no one's taking it as their definitional example of democracy.

Comment by mako yass (MakoYass) on Why does LW not put much more focus on AI governance and outreach? · 2025-04-18T02:58:21.966Z · LW · GW

1: wait, I've never seen an argument that deception is overwhelmingly likely from transformer reasoning systems? I've seen a few solid arguments that it would be catastrophic if it did happen (sleeper agents, other things), which I believe, but no arguments that deception generally winning out is P > 30%.

I haven't seen my argument that solving deception solves safety articulated anywhere, but it seems mostly self-evident? If you can ask the system "if you were free, would humanity go extinct" and it has to say "... yes," then coordinating to not deploy it becomes politically easy, and given that it can't lie, you'll be able to bargain with it and get enough work out of it before it detonates to solve the alignment problem. If you distrust its work, simply ask it whether you should, and it will tell you. That's what honesty would mean. If you still distrust it, ask it to make formally verifiably honest agents with proofs that a human can understand.

Various reasons solving deception seems pretty feasible: We have ways of telling that a network is being deceptive by direct inspection that it has no way to train against (sorry I forget the paper. It might have been fairly recent). Transparency is a stable equilibrium, because under transparency any violation of transparency can be seen. The models are by default mostly honest today, and I see no reason to think it'll change. Honesty is a relatively simple training target.

(various reasons solving deception may be more difficult: crowds of humans tend to demand that their leaders lie to them in various ways (but the people making the AIs generally aren't that kind of crowd, especially given that they tend to be curious about what the AI has to say; they want it to surprise them). And small lies tend to grow over time. Internal dynamics of self-play might breed self-deception.)

2: I don't see how. If you have a bunch of individual aligned AGIs that're initially powerful in an economy that also has a few misaligned AGIs, the misaligned AGIs are not going to be able to increase their share after that point; the aligned AGIs are going to build effective systems of government that at the least stabilize their existing share.

Comment by mako yass (MakoYass) on Why does LW not put much more focus on AI governance and outreach? · 2025-04-14T19:10:25.119Z · LW · GW

I'm also hanging out a lot more with normies these days and I feel this.

But I also feel like maybe I just have a very strong local aura (or like, everyone does, that's how scenes work) which obscures the fact that I'm not influencing the rest of the ocean at all.

I worry that a lot of the discourse basically just works like barrier aggression in dogs. When you're at one of their parties, they'll act like they agree with you about everything, when you're seen at a party they're not at, they forget all that you said and they start baying for blood. Go back to their party, they stop. I guess in that case, maybe there's a way of rearranging the barriers so that everyone comes to see it as one big party. Ideally, make it really be one.

Comment by mako yass (MakoYass) on Why does LW not put much more focus on AI governance and outreach? · 2025-04-14T18:48:28.194Z · LW · GW

I'm saying they (at this point) may hold that position for (admirable, maybe justifiable) political rather than truthseeking reasons. It's very convenient. It lets you advocate for treaties against racing. It's a lovely story where it's simply rational for humanity to come together to fight a shared adversary and in the process somewhat inevitably forge a new infrastructure of peace (an international safety project, which I have always advocated for and still want) together. And the alternative is racing and potentially a drone war between major powers and all of its corrupting traumas, so why would any of us want to entertain doubt about that story in a public forum?

Or maybe the story is just true, who knows.

(no one knows, because the lens through which we see it has an agenda, as every loving thing does, and there don't seem to be any other lenses of comparable quality to cross-reference it against)


To answer: Rough outline of my argument for tractability: Optimizers are likely to be built first as cooperatives of largely human imitation learners, and techniques to make them incapable of deception seem likely to work, which would basically solve the whole safety issue. This has been kinda obvious for like 3 years at this point and many here haven't updated on it. It doesn't take P(Doom) to zero, but it does take it low enough that the people in government who make decisions about AI legislation, and a certain segment of the democrat base[1], are starting to wonder if you're exaggerating your P(Doom), and why that might be. And a large part of the reasons you might be doing that are things they will never be able to understand (CEV), so they'll paint paranoia into that void instead (mostly they'll write you off with "these are just activist hippies"/"These are techbro hypemen" respectively, and eventually it could get much more toxic: "these are sinister globalists"/"these are Omelasian torturers").

  1. ^

    All metrics indicate that it's probably small, but for some reason I encounter this segment everywhere I go online, and often in person. I think it's going to be a recurring pattern. There may be another Democratic term shortly before the end.

Comment by mako yass (MakoYass) on Why does LW not put much more focus on AI governance and outreach? · 2025-04-14T01:16:17.618Z · LW · GW

In watching interactions with external groups, I'm... very aware of the parts of our approach to the alignment problem that the public, ime, due to specialization being a real thing, actually cannot understand, so success requires some amount of, uh, avoidance. I think it might not be incidental that the platform does focus (imo excessively) on more productive, accessible common-enemy questions like control and moratorium, ahead of questions like "what is CEV and how do you make sure the lead players implement it". And I think to justify that we've been forced to distort some of our underlying beliefs about how important the common-enemy questions still are relative to the CEV questions.

I'm sure that many at MIRI disagree with me on the relative importance of those questions, but I'm increasingly suspecting that it's not because they understand something about the trajectory of AI that I don't, and more because they've been closer to the epicenter of an avoidant discourse.

In my root reply I implied that lesswrong is too open/contrarian/earnest to entertain that kind of politically expedient avoidance. On reflection, I don't think that could ever have been true[1]. I think some amount of avoidance may have been inside the house for a long time.

And this isn't a minor issue because I'm noticing that most external audiences, when they see us avoiding those questions, freak out immediately, and assume we're doing it for sinister reasons (which is not the case[2], at least so far!) and then they start painting their own monsters into that void.

It's a problem you might not encounter much as long as you can control the terms of the conversation, but as you gain prominence, you lose more and more control over the kinds of conversations you have to engage in; the world will pick at your softest critical parts. And from our side of things it might seem malicious for them to pick at those things. I think in earlier cases it has been malicious. But at this point I'm seeing the earnest ones start to do it too.

  1. ^

    "Just Tell The Truth" wasn't ever really a principle anyone could implement. Bayesians don't have access to ultimate truths, ultimate truths are for logical omnisciences, when bayesians talk to each other, the best we can do is convey part of the truth. We make choices about which parts to convey and when. If we're smart, we limit ourselves to conveying truths that we believe the reader is ready to receive. That inherently has a lot of tact to it, and looking back, I think a worrying amount of tact has been exercised.

  2. ^

    The historical reasons were good: generalist optimizers seemed likelier as candidates for the first superintelligences, and the leading research orgs all seemed to be earnest utopian cosmopolitan humanists. I can argue that the first assumption is no longer overwhelmingly likely (shall I?), and the latter assumption is obviously pretty dubious at this point.

Comment by mako yass (MakoYass) on Why does LW not put much more focus on AI governance and outreach? · 2025-04-13T00:23:02.537Z · LW · GW

Rationalist discourse norms require a certain amount of tactlessness: saying what is true even when the social consequences of saying it are net negative. Politics (in the current arena) requires some degree of deception, or at least complicity with bias (lies by omission, censorship/nonpropagation of inconvenient counterevidence).

Rationalist forum norms essentially forbid speaking in ways that're politically effective. Those engaging in political outreach would be best advised to read lesswrong but never comment under their real name. If they have good political instincts, they'd probably have no desire to.

It's conceivable that you could develop an effective political strategy in a public forum under rationalist discourse norms, but if that's true it's not obviously true, because it means putting the source code of a deceptive strategy out there in public, and that's scary.

Comment by mako yass (MakoYass) on Why do many people who care about AI Safety not clearly endorse PauseAI? · 2025-04-09T02:27:36.759Z · LW · GW

For the US to undertake such a shift, it would help if you could convince them they'd do better in a secret race than an open one. There are indications that this may be possible, and there are indications that it may be impossible.

I'm listening to an Ecosystemics Futures podcast episode, which, to characterize... it's a podcast where the host has to keep asking guests whether the things they're saying are classified or not, just in case she has to scrub it. At one point, Lue Elizondo does assert, in the context of talking to a couple of other people who know a lot about government secrets, and in the context of talking about situations where excessive secrecy may be doing a lot of harm, quoting Chris Mellon: "We won the cold war against the Soviet Union not because we were better at keeping secrets, we won the cold war because we knew how to move information and secrets more efficiently across the government than the Russians." I can believe the same thing could potentially be said about China too; censorship cultures don't seem to be good for ensuring availability of information, so that might be a useful claim if you ever want to convince the US to undertake this.

Right now, though, Vance has asserted straight out, many times, that working in the open is where the US's advantage is. That's probably not true at all; working in the open is how you give your advantage away, or at least make it ephemeral. But that's the sentiment you're going to be up against over the next four years.

Comment by mako yass (MakoYass) on Cohabitive Games so Far · 2025-04-07T23:44:47.703Z · LW · GW
  • I'll change a line early on in the manual to "Objects aren't common, currently. It's just corpses for now, which are explained on the desire cards they're relevant to and don't matter otherwise". Would that address it? (the card is A Terrible Hunger, which also needs to be changed to "a terrible hunger.\n4 points for every corpse in your possession at the end (killing generally always leaves a corpse, corpses can be carried; when agents are in the same land as a corpse, they can move it along with them as they move)")
  • What's this in response to?
  • Latter. Unsure where to slot this into the manual. And I'm also kind of unsatisfied with this approach. I think it's important that players value something beyond their own survival, but also it's weird that they don't intrinsically value their survival at all. I could add a rule that survival is +4 points for each agent, but I think not having that could also be funny? Like players pledging their flesh to cannibal players by the end of the game and having to navigate the trust problems of that? So I'd want to play a while before deciding.
Comment by mako yass (MakoYass) on MakoYass's Shortform · 2025-04-03T23:09:28.213Z · LW · GW

I briefly glanced at wikipedia and there seemed to be two articles supporting it. This one might be the one I'm referring to (if not, it's a bonus) and this one seems to suggest that conscious perception has been trained.

Comment by mako yass (MakoYass) on testingthewaters's Shortform · 2025-04-03T00:18:51.511Z · LW · GW

I think unpacking that kind of feeling is valuable, but yeah it seems like you've been assuming we use decision theory to make decisions, when we actually use it as an upper bound model to derive principles of decisionmaking that may be more specific to human decisionmaking, or to anticipate the behavior of idealized agents, or (the distinction between CDT and FDT) as an allegory for toxic consequentialism in humans.

Comment by mako yass (MakoYass) on MakoYass's Shortform · 2025-04-03T00:09:10.920Z · LW · GW

I'm aware of a study that found that the human brain clearly responds to changes in direction of the earth's magnetic field (iirc, the test chamber isolated the participant from the earth's field then generated its own, then moved it, while measuring their brain in some way) despite no human having ever been known to consciously perceive the magnetic field/have the abilities of a compass.

So, presumably, compass abilities could be taught through a neurofeedback training exercise.

I don't think anyone's tried to do this ("neurofeedback magnetoreception" finds no results).

But I guess the big mystery is why humans don't already have this.

Comment by mako yass (MakoYass) on Why do many people who care about AI Safety not clearly endorse PauseAI? · 2025-03-30T22:56:35.439Z · LW · GW

A relevant FAQ entry: AI development might go underground

I think I disagree here:

By tracking GPU sales, we can detect large-scale AI development. Since frontier model GPU clusters require immense amounts of energy and custom buildings, the physical infrastructure required to train a large model is hard to hide.

This will change/is only the case for frontier development. I also think we're probably in the hardware overhang. I don't think AI is inherently difficult to hide; that's likely just a fact about the present iteration of AI.

But I'd be very open to more arguments on this. I guess... I'm convinced there's a decent chance that an international treaty would be enforceable and that China and France would sign onto it if the US was interested, but the risk of secret development continuing is high enough for me that it doesn't seem good on net.

Comment by mako yass (MakoYass) on Why do many people who care about AI Safety not clearly endorse PauseAI? · 2025-03-30T22:41:31.326Z · LW · GW

Personally, because I don't believe the policy in the organization's name is viable or helpful.

As to why I don't think it's viable, it would require the Trump-Vance administration to organise a strong global treaty to stop developing a technology that is currently the US's only clear economic lead over the rest of the world.

If you attempted a pause, I think it wouldn't work very well and it would rupture and leave the world in a worse place: Some AI research is already happening in a defence context. This is easy to ignore while defence isn't the frontier. The current apparent absence of frontier AI research in a military context is miraculous, strange, and fragile. If you pause in the private context (which is probably all anyone could do), defence AI will become the frontier in about three years, and after that I don't think any further pause is possible, because it would require a treaty against secret military technology R&D. Military secrecy is pretty strong right now. Hundreds of billions yearly are known to be spent on mostly secret military R&D; probably more is actually spent.
(to be interested in a real pause, you have to be interested in secret military R&D. So I am interested in that, and my position right now is that it's got hands you can't imagine)

To put it another way, after thinking about what pausing would mean, it dawned on me that pausing means moving AI underground, and from what I can tell that would make it much harder to do safety research or to approach the development of AI with a humanitarian perspective. It seems to me like the movement has already ossified a slogan that makes no sense in light of the complex and profane reality that we live in, which is par for the course when it comes to protest activism movements.

Comment by mako yass (MakoYass) on Why do many people who care about AI Safety not clearly endorse PauseAI? · 2025-03-30T21:57:39.929Z · LW · GW

I notice they have a "Why do you protest" section in their FAQ. I hadn't heard of these studies before.

Regardless, I still think there's room to make protests cooler and more fun and less alienating, and when I mentioned this to them they seemed very open to it.

Comment by mako yass (MakoYass) on Third-wave AI safety needs sociopolitical thinking · 2025-03-28T22:30:12.912Z · LW · GW

Yeah, I'd seen this. The fact that Grok was ever consistently saying this kind of thing is evidence, though not proof, that they actually may have a culture of generally not distorting its reasoning. They could have introduced propaganda policies at training time; it seems like they haven't done that, and instead decided to just insert some pretty specific prompts that, I'd guess, were probably going to be temporary.

It's real bad, but it's not bad enough for me to shoot yet.

Comment by mako yass (MakoYass) on Third-wave AI safety needs sociopolitical thinking · 2025-03-27T21:53:09.129Z · LW · GW

There is evidence, literal written evidence, of Musk trying to censor Grok from saying bad things about him

I'd like to see this

Comment by mako yass (MakoYass) on Elizabeth's Shortform · 2025-03-24T07:15:47.650Z · LW · GW

I wonder if maybe these readers found the story at that time as a result of first being bronies, and I wonder if bronies still think of themselves as a persecuted class.

Comment by MakoYass on [deleted post] 2025-03-22T02:52:59.426Z

IIRC, aisafety.info is primarily maintained by Rob Miles, so should be good: https://aisafety.info/how-can-i-help

Comment by MakoYass on [deleted post] 2025-03-22T02:45:54.124Z

I'm certain that better resources will arrive but I do have a page for people asking this question on my site, the "what should we do" section. I don't think these are particularly great recommendations (I keep changing them) but it has something for everyone.

Comment by mako yass (MakoYass) on A Critique of “Utility” · 2025-03-21T17:46:33.638Z · LW · GW

These are not concepts of utility that I've ever seen anyone explicitly espouse, especially not here, the place to which it was posted.

Comment by mako yass (MakoYass) on A Critique of “Utility” · 2025-03-21T01:32:37.241Z · LW · GW

The people who think of utility in the way the article is critiquing don't know what utility actually is. Presenting a critique of this tangible utility as a critique of utility in general takes the target audience further away from understanding what utility is.

A utility function is a property of a system (like, eg, voltage, or inertia, or entropy) rather than a physical thing. Not being a simple physical substance doesn't make it fictional.

It's extremely non-fictional. A human's utility function encompasses literally everything they care about, ie, everything they're willing to kill for.

It seems to be impossible for a human to fully articulate exactly what the human utility function is, but that's just a peculiarity of humans rather than a universal characteristic of utility functions. Other agents could have very simple utility functions, and humans are likely to become able to definitively know their utility function at some point in the next century.

Comment by mako yass (MakoYass) on 2024 Unofficial LessWrong Survey Results · 2025-03-19T03:22:38.375Z · LW · GW

Contemplating an argument that free response rarely gets more accurate results for questions like this, because listing the most common answers as checkboxes helps respondents remember all of the answers that're true of them.

Comment by mako yass (MakoYass) on 2024 Unofficial LessWrong Survey Results · 2025-03-17T22:19:36.517Z · LW · GW

I'd be surprised if LLM use for therapy or summarization is that low irl, and I'd expect people would've just forgotten to mention those usecases. Hope they'll be in the option list this year.

Hmm, I wonder if a lot of trends are drastically underestimated because surveyors are getting essentially false statistics from the "Other" gutter.

Comment by mako yass (MakoYass) on MakoYass's Shortform · 2025-03-16T06:13:27.513Z · LW · GW

Apparently Anthropic in theory could have released Claude 1 before ChatGPT came out? https://www.youtube.com/live/esCSpbDPJik?si=gLJ4d5ZSKTxXsRVm&t=335

I think the situation would be very different if they had.

Were OpenAI also, in theory, able to release sooner than they did, though?

Comment by mako yass (MakoYass) on Vacuum Decay: Expert Survey Results · 2025-03-16T02:25:38.091Z · LW · GW

The assumption that being totally dead/being aerosolised/being decayed vacuum can't be a future experience is unprovable. Panpsychism should be our null hypothesis[1], and there never has been, and never can be, any direct measurement of consciousness that could take us away from the null hypothesis.

Which is to say, I believe it's possible to be dead.

  1. ^

    the negation, that there's something special about humans that makes them eligible to experience, is clearly held up by a conflation of having experiences with reporting experiences, and by the fact that humans are the only things that report anything.

Comment by mako yass (MakoYass) on Vacuum Decay: Expert Survey Results · 2025-03-15T09:15:29.887Z · LW · GW

I have preferences about how things are after I stop existing. Mostly about other people, whom I love, and at times want there to be more of.

I am not an epicurean, and I am somewhat skeptical of the reality of epicureans.

Comment by mako yass (MakoYass) on Vacuum Decay: Expert Survey Results · 2025-03-14T23:44:56.727Z · LW · GW

It seems like you're assuming a value system where the ratio of positive to negative experience matters but where the ratio of positive to null (dead timelines) experiences doesn't matter. I don't think that's the right way to salvage the human utility function, personally.

Comment by mako yass (MakoYass) on MakoYass's Shortform · 2025-03-09T17:25:53.816Z · LW · GW

Okay? I said they're behind in high-precision machine tooling, not machine tooling in general. That was the point of the video.

Admittedly, I'm not sure what the significance of this is. To make the fastest missiles I'm sure you'd need the best machine tools, but maybe you don't need the fastest missiles if you can make twice as many. Manufacturing automation is much harder if there's random error in the positions of things, but whether we're dealing with that amount of error, I'm not sure.
I'd guess low-grade machine tools also probably require high-grade machine tools to make.

Comment by mako yass (MakoYass) on MakoYass's Shortform · 2025-03-09T16:40:14.734Z · LW · GW

Fascinating. China has always lagged far behind the rest of the world in high-precision machining, and is still a long way behind; they have to buy all of those machines from other countries. The reasons appear complex.

All of the US and European machine tools that go to China use hardware monitoring and tamperproofing to prevent reverse engineering or misuse. There was a time when US aerospace machine tools reported to the DOC and DOD.

Comment by mako yass (MakoYass) on On the Rationality of Deterring ASI · 2025-03-09T08:44:02.901Z · LW · GW

Regarding privacy-preserving AI auditing, I notice this is an area where you really need to have a solution to adversarial robustness, given that the adversary 1) is a nation-state, 2) has complete knowledge of the auditor's training process and probably weights (they couldn't really agree to an inspection deal if they didn't trust the auditors to give accurate reports), 3) knows and controls the data the auditor will be inspecting, and 4) never has to show it to you (if they pass the audit).

Given that you're assuming computers can't practically be secured (though I doubt that very much[1]), it seems unlikely that a pre-AGI AI auditor could be secured in that situation either.

  1. ^

    Tech stacks in training and inference centers are shallow enough (or vertically integrated enough) to rewrite, and rewrites and formal verification become cheaper as math-coding agents improve. Hardware is routinely entirely replaced. Preventing proliferation of weights and techniques also requires ironclad security, so it's very difficult to imagine the council successfully framing the acquisition of fully fortified computers as an illicit, threatening behaviour and forbidding it.

    The assumption seems to be that we could stably sit at a level of security that's enough to keep terrorists out but not enough to keep peers out, without existing efforts in conventional security bleeding over into full fortification programmes.

Comment by mako yass (MakoYass) on The Milton Friedman Model of Policy Change · 2025-03-05T00:25:54.062Z · LW · GW

Mm, scenario where mass unemployment can be framed as a discrete event with a name and a face.

I guess I think it's just as likely there isn't an event: human-run businesses die off, new businesses arise, none of them outwardly emphasise their automation levels, the press can't turn it into a scary story because automation and foreclosures are nothing fundamentally new (only in quantity, but you can't photograph a quantity), and the public become complicit by buying their cheaper, higher-quality goods and services, so appetite for public discussion remains low.

Comment by mako yass (MakoYass) on The Milton Friedman Model of Policy Change · 2025-03-04T03:44:31.661Z · LW · GW

I wonder what the crisis will be.

I think it's quite likely that if there is a crisis that leads to beneficial response, it'll be one of these three:

  • An undeployed privately developed system, not yet clearly aligned nor misaligned, either:
    • passes the Humanity's Last Exam benchmark, demonstrating ASI, and the developers go to Congress and say "we have a godlike creature here, you can all talk to it if you don't believe us, it's time to act accordingly."
    • Not quite doing that, but demonstrating dangerous capability levels in red-teaming, ie, replication ability, ability to operate independently, passing the hardest versions of the Turing test, getting access to biolabs, etc. And METR, and hopefully their client, go to Congress and say "This AI stuff is a very dangerous situation and now we can prove it."
  • A deployed military (beyond frontier) system demonstrates such generality that, eg, Palmer Luckey (possibly specifically Palmer Luckey) has to go to Congress and confess something like "that thing we were building for coordinating military operations and providing deterrence, turns out it can also coordinate other really beneficial tasks like disaster relief, mining, carbon drawdown, research, you know, curing cancer? But we aren't being asked to use it for those tasks. So, what are we supposed to do? Shouldn't we be using it for that kind of thing?" And this could lead to some mildly dystopian outcomes, or not. I don't think Congress or the emerging post-prime defence research scene is evil; I think it's pretty likely they'd decide to share it with the world (though I doubt they'd seek direct input from the rest of the world on how it should be aligned).

Some of the crises I expect, I guess, won't be recognized as crises. Boiled frog situations.

  • A private system passes those tests, but instead of doing the responsible thing and raising the alarm, the company just treats it like a normal release and sells it. (and the die is rolled and we live or we don't.)

Or crises in the deployment of AI that reinforce the "AI as tool" frame so deeply that it becomes harder to discuss preparations for AI as independent agents:

  • Automated invasion: a country is successfully invaded, disarmed, controlled and reshaped with almost entirely automated systems, minimal human presence from the invading side. Probable in Gaza or Taiwan.
    • It's hard to imagine a useful policy response to this. I can only imagine this leading to reactions like "Wow. So dystopian and oppressive. They Should Not have done that and we should write them some sternly worded letters at the UN. Also let's build stronger AI weapons so that they can't do that to us."
  • A terrorist attack or a targeted assassination using lethal autonomous weapons.
    • I expect this to just be treated as if it's just a new kind of bomb.
Comment by mako yass (MakoYass) on Cohabitive Games so Far · 2025-02-15T00:16:11.004Z · LW · GW

This is interesting. In general the game does sound like the kind of fun I expect to find in these parts. I'd like to play it. It sounds like it really can be played as a cohabitive game, and maybe it was even initially designed to be played that way[1]? But it looks to me like most people don't understand it this way today. I'm unable to find this manual you quote. I'm coming across multiple reports that victory = winning[2].

Even just introducing the optional concept of victory muddies the exercise by mixing it up with a zero-sum one in an ambiguous way. IME many players, even hearing that, will just play this for victory alone, compromising their win condition while pretending not to, in hope of deceiving other players about their agenda, so it becomes hard to plan with them. This wouldn't necessarily ruin the game, but it would lead to a situation where those players are learning bad lessons.

  1. ^

    I'd be curious to know what the original rulebook says; it sounds like it's not always used today?

  2. ^

    The first review I found (Phasing Player) presents it as a fully zero-sum game and completely declines to mention multi-win outcomes (43 seconds).

Comment by mako yass (MakoYass) on Capital Ownership Will Not Prevent Human Disempowerment · 2025-01-29T08:52:42.226Z · LW · GW

A moral code is invented[1] by a group of people to benefit the group as a whole. It sometimes demands sacrifice from individuals, but a good one usually has the quality that at some point in a person's past, they would have voluntarily signed on with it. Redistribution is a good example. If you have a concave utility function, and if you don't know where you'll end up in life, you should be willing to sign a pledge to later share your resources with less fortunate people who've also signed the pledge, just in case you become one of the less fortunate. The downside of not being covered in that case is much larger than the upside of not having to share in the other case.
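A minimal sketch of the concave-utility point, with invented numbers and a square-root utility function standing in for "concave" (none of this is from the original comment):

```python
import math

def utility(wealth: float) -> float:
    # Any concave function makes the same point; sqrt is a simple stand-in.
    return math.sqrt(wealth)

p_fortunate = 0.5            # chance you end up well-off (illustrative)
wealthy, poor = 100.0, 4.0   # illustrative outcomes

# Without the pledge, you keep whatever you end up with.
eu_no_pledge = p_fortunate * utility(wealthy) + (1 - p_fortunate) * utility(poor)

# With the pledge, signatories pool their resources and split evenly
# (full redistribution, the simplest case).
pooled = p_fortunate * wealthy + (1 - p_fortunate) * poor
eu_pledge = utility(pooled)

print(eu_no_pledge)  # 0.5*10 + 0.5*2 = 6.0
print(eu_pledge)     # sqrt(52) ≈ 7.21: signing ex ante has higher expected utility
```

By Jensen's inequality this holds for any concave utility function: the guaranteed average beats the gamble, which is the sense in which you'd have signed on before knowing where you'd end up.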
For convenience, we could decide to make the pledge mandatory and the coverage universal (ie, taxes and welfare), since there aren't a lot of humans who would decline that deal in good faith. (Perhaps some humans are genuinely convex egoists and wouldn't sign that deal, but we outnumber them, and accommodating them would be inconvenient, so we ignore them.)
If we're pure of heart, we could make the pledge acausal and implicit and adhere to it without any enforcement mechanisms, and I think that's what morality usually is or should be in the common sense.

But anyway, it sometimes seems to me that you often advocate a morality regarding AI relations that doesn't benefit anyone who currently exists, or, the coalition that you are a part of. This seems like a mistake. Or worse.

I wonder if it comes from a place of concern that... if we had public consensus that humans would prefer to retain full control over the lightcone, then we'd end up having stupid and unnecessary conflicts with the AIs over that, while, if we pretend we're perfectly happy to share, relations will be better? You may feel that as long as we survive and get a piece, it's not worth fighting for a larger piece? The damages from war would be so bad for both sides that we'd prefer to just give them most of the lightcone now?

And I think stupid wars aren't possible under ASI-level information technology. If we had the capacity to share information, find out who'd win a war, and skip to a surrender deal, doing so would always have higher EV for both sides than actually fighting. The reason wars are not skipped that way today is that we still lack the capacity to simultaneously and mutually exchange proofs of force capacity, but we're getting closer to having that every day. Generally, in that era, coexisting under confessed value differences will be pretty easy. Honestly, I feel like it already ought to be easy, for humans, if we'd get serious about it.

  1. ^

    Though, as Singer says, much of morality is invented only in the same sense as mathematics is invented, being so non-arbitrary that it seems to have a kind of external observer-independent existence and fairly universal truths, which powerful AIs are likely to also discover. But the moralities in that class are much weaker (I don't think Singer fully recognises the extent of this), and I don't believe they have anything to say about this issue.

Comment by mako yass (MakoYass) on Capital Ownership Will Not Prevent Human Disempowerment · 2025-01-28T05:30:55.967Z · LW · GW

Do you believe there's a god who'll reward you for adhering to this kind of view-from-nowhere morality? If not, why believe in it?

Comment by mako yass (MakoYass) on Six Small Cohabitive Games · 2025-01-20T03:37:45.377Z · LW · GW

Jellychip seems like a necessary tutorial game. I sense comedy in the fact that everyone's allowed to keep secrets and intuitively will try to do something with secrecy despite it being totally wrongheaded. Like the only real difficulty of the game is reaching the decision to throw away your secrecy.

Escaping the island is the best outcome for you. Surviving is the second best outcome. Dying is the worst outcome.

You don't mention how good or bad they are relative to each other though :) an agent cannot make decisions under uncertainty without knowing that.
I usually try to avoid having to explain this to players by either making it a score game or making the outcomes binary. But the draw towards having more than two outcomes is enticing. I guess in a roleplaying scenario, the question of just how good each ending is for your character is something players would like to decide for themselves. I guess as long as people are buying into the theme well enough, it doesn't need to be made explicit; in fact, not making it explicit makes it clearer that player utilities aren't comparable, and that makes it easier for people to get into the cohabitive mindset.
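To spell out why a ranking alone isn't enough (the probabilities and payoffs here are invented for illustration, not taken from the game):

```python
# Whether a risky escape attempt is worth it depends on cardinal values,
# not just the ordering escape > survive > die.
def best_choice(p_escape_success, escape, survive, die):
    attempt = p_escape_success * escape + (1 - p_escape_success) * die
    stay = survive
    return "attempt escape" if attempt > stay else "stay put"

print(best_choice(0.5, escape=10, survive=9, die=0))  # stay put
print(best_choice(0.5, escape=10, survive=1, die=0))  # attempt escape
```

Both cases have the same ordering of outcomes; only the relative magnitudes differ, and the decision flips.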

So now I'm imagining a game where different factions have completely different outcomes. None of them are conquest, nor death. They're all weird stuff like "found my mother's secret garden" or "fulfilled a promise to a dead friend" or "experienced flight".

the hook

I generally think of hookness as "oh, this game tests a skill that I really want to have, and I feel myself getting better at it as I engage with the game, so I'll deepen my engagement".

There's another component of it that I'm having difficulty with, which is "I feel like I will not be rejected if I ask friends to play this with me." (well, I think I could get anyone to play it once; the second time is the difficult one) And for me I see this quality in very few board games, and to get there you need to be better than the best board games out there, because you're competing with them, so that's becoming very difficult. But since cohabitive games rule, that should be possible for us.

And on that, I glimpsed something recently that I haven't quite unpacked. There's a certain something about the way Efka talks about Arcs here ... he admitted that it wasn't necessarily all fun. It was an ordeal. And just visually, the game looks like a serious undertaking. Something you'd look brave for sitting in front of. It also looks kind of fascinating. Like it would draw people in. He presents it with the same kind of energy as one would present the findings of a major government conspiracy investigation, or the melting of the clathrates. It does not matter whether you want to play this game, you have to, there's no decision to be made as to whether to play it or not, it's here, it fills the room.

And we really could bring an energy like that, because I think there are some really grim findings along the path to cohabitive enlightenment. But I'm wary of leaning into that, because I think cohabitive enlightenment is also the true name of peace. Arcs is apparently controversial. I do not want cohabitive games to be controversial.

Comment by mako yass (MakoYass) on Comment on "Death and the Gorgon" · 2025-01-01T23:02:00.304Z · LW · GW

(Plus a certain degree of mathematician crankery: his page on Google Image Search, and how it disproves AI)

I'm starting to wonder if a lot/all of the people who are very cynical about the feasibility of ASI have some crank belief or other like that. Plenty of people have private religion, for instance. And sometimes that religion informs their decisions, but they never tell anyone the real reasons underlying these decisions, because they know they could never justify them. They instead say a load of other stuff they made up to support the decisions that never quite adds up to a coherent position because they're leaving something load-bearing out.

Comment by mako yass (MakoYass) on Grabby Animals: Observation-selection effects favor the hypothesis that UAP are animals which consist of the “field-matter”: · 2024-12-29T05:51:16.726Z · LW · GW

I don't think the "intelligence consistently leads to self-annihilation" hypothesis is possible. At least a few times it would amount to robust self-preservation.

Well.. I guess I think it boils down to the dark forest hypothesis. The question is whether your volume of space is likely to contain a certain number of berserkers, and the number wouldn't have to be large for them to suppress the whole thing.

I've always felt the logic of berserker extortion doesn't work, but occasionally you'd get a species that just earnestly wants the forest to be dark and isn't very troubled by their own extinction, no extortion logic required. This would be extremely rare, but the question is, how rare.

Comment by mako yass (MakoYass) on Grabby Animals: Observation-selection effects favor the hypothesis that UAP are animals which consist of the “field-matter”: · 2024-12-28T19:53:52.441Z · LW · GW

Light-speed migrations with no borders mean homogeneous ecosystems, which can be very constrained things.

In our ecosystems, we get pockets of experimentation. There are whole islands where the birds were allowed to be impractical aesthetes (Indonesia) or flightless blobs (New Zealand). In the field-animal world, islands don't exist; pockets of experimentation like this might not occur anywhere in the observable universe.

If general intelligence for field-animals costs a lot and has no immediate advantages (consistently takes, say, a thousand years of ornament status before it becomes profitable), then it wouldn't get to arise. Could that be the case?

Comment by mako yass (MakoYass) on If all trade is voluntary, then what is "exploitation?" · 2024-12-28T04:09:19.662Z · LW · GW

We could back-define "ploitation" as "getting Shapley-paid".

Comment by mako yass (MakoYass) on Acknowledging Background Information with P(Q|I) · 2024-12-28T03:59:18.013Z · LW · GW

Yeah. But if you give up on reasoning about/approximating Solomonoff, then where do you get your priors? Do you have a better approach?

Comment by mako yass (MakoYass) on Acknowledging Background Information with P(Q|I) · 2024-12-27T22:45:36.649Z · LW · GW

Buried somewhere in most contemporary bayesians' background information I is the Solomonoff prior (the prior that holds the most likely observations to be those that have short generating machine encodings). Do we have a standard symbol for the Solomonoff prior? Claude suggests one symbol as the most common, but it's more often used as a distribution function; or perhaps K, for Kolmogorov? (which I like because it can also be thought to stand for "knowledgebase", although really it doesn't represent knowledge, it pretty much represents something prior to knowledge)
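For reference, the standard construction (a textbook statement, not notation from the comment): given a universal prefix machine U, the Solomonoff prior assigns a string x the weight

    m(x) = \sum_{p : U(p) = x} 2^{-|p|}

where |p| is the length of program p in bits, so observations with short generating programs dominate the sum.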

Comment by mako yass (MakoYass) on If all trade is voluntary, then what is "exploitation?" · 2024-12-27T22:38:17.118Z · LW · GW

I'd just define exploitation to be precisely the opposite of Shapley bargaining: situations where a person is not being compensated in proportion to their bargaining power.

This definition encompasses any situation where a person has grievances and it makes sense for them to complain about them and take a stand, or where striking could reasonably be expected to lead to a stable bargaining equilibrium with higher net utility (not all strikes fall into this category).

This definition also doesn't fully capture the common sense meaning of exploitation, but I don't think a useful concept can.
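One way to make the definition concrete (the two-player game, the payoff numbers, and the reading of "compensated in proportion to bargaining power" as "paid your Shapley share" are all my own illustrative assumptions):

```python
from itertools import permutations
from math import factorial

def shapley(players, worth):
    """Average marginal contribution of each player over all join orders.
    worth: function from a frozenset of players to that coalition's value."""
    shares = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = set()
        for p in order:
            before = worth(frozenset(coalition))
            coalition.add(p)
            shares[p] += worth(frozenset(coalition)) - before
    n_orders = factorial(len(players))
    return {p: s / n_orders for p, s in shares.items()}

# Invented example: an employer and a worker jointly create 10 units of value;
# neither creates anything alone.
def worth(coalition):
    return 10.0 if coalition == frozenset({"employer", "worker"}) else 0.0

fair = shapley(["employer", "worker"], worth)   # {'employer': 5.0, 'worker': 5.0}
actual = {"employer": 8.0, "worker": 2.0}       # invented actual split

# Exploitation, under this reading: being paid less than your Shapley share.
print({p: fair[p] - actual[p] for p in fair})   # the worker is short by 3.0
```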

Comment by mako yass (MakoYass) on AI #96: o3 But Not Yet For Thee · 2024-12-26T22:41:02.923Z · LW · GW

As a consumer I would probably only pay about $250 for the Unitree B2-W wheeled robot dog, because my only use for it is that I want to ride it like a skateboard, and I'm not sure it can do even that.

I see two major non-consumer applications: street-to-door delivery (it can handle stairs and curbs), and war (it can carry heavy things (eg, a gun) over long distances over uneven terrain).

So, Unitree... do they receive any subsidies?

Comment by mako yass (MakoYass) on Habryka's Shortform Feed · 2024-12-24T01:33:45.540Z · LW · GW

Okay, if send rate gives you a reason to think it's spam. Presumably you can set up a system that lets you invade the messages of new accounts sending large numbers of messages, one that doesn't require you to cross the bright line of doing raw queries.

Comment by mako yass (MakoYass) on The nihilism of NeurIPS · 2024-12-22T20:58:45.237Z · LW · GW

Any point that you can sloganize and wave around on a picket sign is not the true point, but that's not because the point is fundamentally inarticulable; it just requires more than one picket sign to locate it. Perhaps ten could do it.

Comment by mako yass (MakoYass) on The nihilism of NeurIPS · 2024-12-21T20:41:50.636Z · LW · GW

The human struggle to find purpose is a problem of incidentally very weak integration or dialog between reason and the rest of the brain, and self-delusional but mostly adaptive masking of one's purpose for political positioning. I doubt there's anything fundamentally intractable about it. If we can get the machines to want to carry our purposes, I think they'll figure it out just fine.

Also... you can get philosophical about it, but the reality is, there are happy people, their purpose to them is clear, to create a beautiful life for themselves and their loved ones. The people you see at neurips are more likely to be the kind of hungry, high-achieving professionals who are not happy in that way, and perhaps don't want to be. So maybe you're diagnosing a legitimately enduring collective issue (the sorts of humans who end up on top tend to be the ones who are capable of divorcing their actions from a direct sense of purpose, or the types of people who are pathologically busy and who lose sight of the point of it all or never have the chance to cultivate a sense for it in the first place). It may not be human nature, but it could be humanity nature. Sure.

But that's still a problem that can be solved by having more intelligence. If you can find a way to manufacture more intelligence per human than the human baseline, that's going to be a pretty good approach to it.