Imagine that the Morris worm never happened, nor Blaster, nor Samy. A few people independently discovered SQL injection but kept it to themselves. [...]
That hypothetical world is almost impossible, because it's unstable. As soon as certain people noticed that they could get an advantage, or even a laugh, out of finding and exploiting bugs, they'd do it. They'd also start building on the art, and they'd even find ways to organize. And finding out that somebody had done it would inspire more people to do it.
You could probably have a world without the disclosure norm, but I don't see how you could have a world without the actual exploitation.
We have driverless cars, robosurgeons, and simple automated agents acting for us, all with the security of original Sendmail.
None of those things are exactly bulletproof as it is.
But having the whole world at the level you describe basically sounds like you've somehow managed to climb impossibly high up an incredibly rickety pile of junk, to the point where instead of getting bruised when you inevitably do fall, you're probably going to die.
Introducing the current norms into that would be painful, but not doing so would just let it keep getting worse, at least toward an asymptote.
and the level of caution I see in biorisk seems about right given these constraints.
If that's how you need to approach it, then shouldn't you shut down ALL biology research, and dismantle the infrastructure? Once you understand how something works, it's relatively easy to turn around and hack it, even if that's not how you originally got your understanding.
Of course there'd be defectors, but maybe only for relatively well understood and controlled purposes like military use, and the cost of entry could be pretty high. If you have generally available infrastructure, anybody can run amok.
There is a real faction, building AI tools and models, that believes that human control over AIs is inherently bad, and that wants to prevent it. Your alignment plan has to overcome that.
That mischaracterizes it completely. What he wrote is not about human control. It's about which humans. Users, or providers?
He said he wanted a "sharp tool". He didn't say he wanted a tool that he couldn't control.
At another level, since users are often people and providers are almost always institutions, you can see it as at least partly about whether humans or institutions should be controlling what happens in interactions with these models. Or maybe about whether many or only a few people and/or institutions should get a say.
An institution of significant size is basically a really stupid AI that's less well behaved than most of the people who make it up. It's not obvious that the results of some corporate decision process are what you want to have in control... especially not when they're filtered through "alignment technologies" that (1) frequently don't work at all and (2) tend to grossly distort the intent when they do sort of work.
That's for the current and upcoming generations of models, which are going to be under human or institutional control regardless, so the question doesn't really even arise... and anyway it really doesn't matter very much. Most of the stuff people are trying to "align" them against is really not all that bad.
Doom-level AGI is pretty different and arguably totally off topic. Still, there's an analogous question: how would you prefer to be permanently and inescapably ruled? You can expect to be surveilled and controlled in excruciating detail, second by second. If you're into BCIs or uploading or whatever, you can extend that to your thoughts. If it's running according to human-created policies, it's not going to let you kill yourself, so you're in for the long haul.
Whatever human or institutional source the laws that rule you come from, they'll probably still be distorted by the "alignment technologies", since nobody has suggested a plausible path to a "do what I really want" module. If we do get non-distorting alignment technology, there may also be constraints on what it can and can't enforce. And, beyond any of that, even if it's perfectly aligned with some intent... there's no rule that says you have to like that intent.
So, would you like to be ruled according to a distorted version of a locked-in policy designed by some corporate committee? By the distorted day to day whims of such a committee? By the distorted day to day whims of some individual?
There are worse things than being paperclipped, which means that in the very long run, however irrelevant it may be to what Keven Fisher was actually talking about, human control over AIs is inherently bad, or at least that's the smart bet.
A random super-AI may very well kill you (but might also possibly just ignore you). It's not likely to be interested enough to really make you miserable in the process. A super-AI given a detailed policy is very likely to create a hellish dystopia, because neither humans nor their institutions are smart or necessarily even good enough to specify that policy. An AI directed day to day by institutions might or might not be slightly less hellish. An AI directed day to day by individual humans would veer wildly between not bad and absolute nightmare. Either of the latter two would probably be omnicidal sooner or later. With the first, you might only wish it had been omnicidal.
If you want to do better than that, you have to come up with both "alignment technology" that actually works, and policies for that technology to implement that don't create a living hell. Neither humans nor institutions have shown much sign of being able to come up with either... so you're likely hosed, and in the long run you're likely hosed worse with human control.
Sorry, I just forgot to answer this until now. I think the issue is that the title doesn't make it clear how different the UK's proposal is from say the stuff that the "labs" negotiated with the US. "UK seems to be taking a hard line on foundation model training", or something?
This is actually way more interesting and impressive than most government or quasi-government output, and I suspect it'd draw a lot of interest with a title that called more attention to its substance.
I'm sorry, but I don't see anything in there that meaningfully reduces my chances of being paperclipped. Not even if they were followed universally.
I don't even see much that really reduces the chances of people (smart enough to act on them) getting bomb-making instructions almost as good as the ones freely available today, or of systems producing words or pictures that might hurt people emotionally (unless they get paperclipped first).
I do notice a lot of things that sound convenient for the commercial interests and business models of the people who were there to negotiate the list. And I notice that the list is pretty much a license to blast ahead on increasing capability, without any restrictions on how you get there. Including a provision that basically cheerleads for building anything at all that might be good for something.
There's really only one concrete action in there involving the models themselves. The White House calls it "testing", but OpenAI mutates it into "red-teaming", which narrows it quite a bit. Not that anybody has any idea how to test any of this using any approach. And testing is NOT how you create secure, correct, or not-everyone-killing software. The stuff under the heading of "Building Systems that Put Security First"... isn't. It's about building an arbitrarily dangerous system and trying to put walls around it.
If it generates them totally at random, then no. They have no author. But even in that case, if you do it in a traditional way you will at least have personally made more decisions about what the output looks like than somebody who trains a model. The whole point of deep learning is that you don't make decisions about the weights themselves. There's no "I'll put a 4 here" step.
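To make that concrete, here's a toy sketch (nothing to do with any real model; the data and numbers are invented purely for illustration) of where the weights in a trained model actually come from:

```python
# Toy illustration: even for a trivial linear model, no human picks the weight values.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))           # made-up inputs
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.normal(size=100)  # made-up targets

w = np.zeros(3)                          # nobody decides "I'll put a 4 here"...
for _ in range(500):
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= 0.1 * grad                      # ...the update rule sets every value

print(w)  # whatever the optimization converged to
```

The author-like decisions are about the architecture, the data, and the loss; the actual numbers in `w` just fall out of the loop, and in a real network there are billions of them.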
I'm really confused about how anybody thinks they can "license" these models. They're obviously not works of authorship. Therefore they don't have copyrights. You can write a license, but anybody can still do anything they want with the model regardless of what you do or don't put into it.
Also, "open source" actually means something and that's not it. I don't actually like the OSD very much, but it's pretty thoroughly agreed upon.
I'd interpret all of that as OpenAI
1. recognizing that the user is going to get total control over the VM, and
2. lying to the LLM in a token effort to discourage most users from using too many resources.
(1) is pretty much what I'd advise them to do anyway. You can't let somebody run arbitrary Python code and expect to constrain them very much. At the MOST you might hope to restrict them with a less-than-VM-strength container, and even that's fraught with potential for error and they would still have access to the ENTIRE container. You can't expect something like an LLM to meaningfully control what code gets run; even humans have huge problems doing that. Better to just bite the bullet and assume the user owns that VM.
(2) is the sort of thing that achieves its purpose even if it fails from time to time, and even total failure can probably be tolerated.
The hardware specs aren't exactly earth-shattering secrets; giving that away is a cost of offering the service. You can pretty easily guess an awful lot about how they'd set up both the hardware and the software, and it's essentially impossible to keep people from verifying stuff like that. Even then, you don't know that the VM actually has the hardware resources it claims to have. I suspect that if every VM on the physical host actually tried to use "its" 54GB, there'd be a lot of swapping going on behind the scenes.
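As a minimal sketch of what I mean (assuming a Linux guest, and nothing about OpenAI's actual setup), this is the kind of checking any user's code can do from inside:

```python
# Probe what the sandbox claims to have; any user-submitted script can do this.
import os

def claimed_memory_kib():
    # /proc/meminfo reports what the kernel inside the VM thinks it has; the host
    # may still be overcommitting, so actually allocating it is the only real test.
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemTotal:"):
                return int(line.split()[1])

print("CPUs visible:", os.cpu_count())
print("MemTotal (KiB):", claimed_memory_kib())
```

None of that tells you what the physical host is doing behind the scenes, which is exactly the point.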
I assume that the VM really can't talk to much if anything on the network, and that that is enforced from OUTSIDE.
I don't know, but I would guess that the whole VM has some kind of externally imposed maximum lifetime independent of the 120 second limit on the Python processes. It would if I were setting it up.
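If I were sketching the shape of that (purely my assumption, not anything OpenAI has described), the inner limit would look something like a hard per-script timeout, with the VM's own lifetime left to whatever provisions and reaps the VMs:

```python
# Hypothetical sketch of the inner, per-process limit only.
import subprocess

try:
    # "user_script.py" is a stand-in for whatever code the user submitted.
    subprocess.run(["python3", "user_script.py"], timeout=120)
except subprocess.TimeoutExpired:
    print("killed after 120 seconds")

# Nothing running inside the VM can reliably enforce a lifetime for the VM itself;
# that has to be imposed from outside, at the hypervisor or orchestration layer.
```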
The bit about retaining state between sessions is interesting, though. Hopefully it only applies to sessions of the same user, but even there it violates an assumption that things outside of the VM might be relying on.
I assumed that humans would at least die off, if not be actively exterminated. Still need to know how and what happens after that. That's not 100 percent a joke.
What's a "CIS"?
... but you never told me what its actual goal was, so I can't decide if this is a bad outcome or not...
On further edit: apparently I'm a blind idiot and didn't see the clearly stated "5 year time horizon" despite actively looking for it. Sorry. I'll leave this here as a monument to my obliviousness, unless you prefer to delete it.
Without some kind of time limit, a bet doesn't seem well formed, and without a reasonably short time limit, it seems impractical.
No matter how small the chance that the bet will have to be paid, it has to be possible for it to be paid, or it's not a bet. Some entity has to have the money and be obligated to pay it out. Arranging for a bet to be paid at any time after their death would cost more than your counterparty would get out of the deal. Trying to arrange a perpetual trust that could always pay is not only grossly impractical, but actually illegal in a lot of places. Even informally asking people to hold money is really unreliable very far out. And an amount of money that could be meaningful to future people could end up tied up forever anyway, which is weird. Even trying to be sure to have the necessary money until death could be an issue.
I'm not really motivated to play, but as an example I'm statistically likely to die in under 25 years barring some very major life extension progress. I'm old for this forum, but everybody has an expiration date, including you yourself. Locating your heirs to pay them could be hard.
Deciding the bet can get hard, too. A recognizable Less Wrong community as such probably will not last even 25 years. Nor will Metaculus or whatever else. A trustee is not going to have the same judgement as the person who originally took your bet.
That's all on top of the more "tractable" long-term risks that you can at least value in somehow... like collapse of whatever currency the bet is denominated in, AI-or-whatever completely remaking the economy and rendering money obsolete, the Rapture, etc, etc.
... but at the same time, it doesn't seem like there's any particular reason to expect definitive information to show up within any adequately short time.
On edit: I bet somebody's gonna suggest a block chain. Those don't necessarily have infinite lives, either, and the oracle that has to tell the chain to pay out could disappear at any time. And money is still tied up indefinitely, which is the real problem with perpetuities.
- Both OpenAI and Anthropic have demonstrated that they have discipline to control at least when they deploy.
Good point. You're right that they've delayed things. In fact, I get the impression that they've delayed for issues I personally wouldn't even have worried about.
I don't think that makes me believe that they will be able to refrain, permanently or even for a very long time, from doing anything they've really invested in, or anything they really see as critical to their ability to deliver what they're selling. They haven't demonstrated any really long delays, the pressure to do more is going to go nowhere but up, and organizational discipline tends to deteriorate over time even without increasing pressure. And, again, the paper's already talking about things like "recommending" against deployment, and declining to analyze admittedly relevant capabilities like Web browsing... both of which seem like pretty serious signs of softening.
But they HAVE delayed things, and that IS undeniably something.
As I understand it, Anthropic was at least partially founded around worries about rushed deployment, so at a first guess I'd suspect Anthropic's discipline would be last to fail. Which might mean that Anthropic would be first to fail commercially. Adverse selection...
- I'm unsure if 'selectively' refers to privileged users, or the evaluators themselves.
It was meant to refer to being selective about users (mostly meaning "customerish" ones, not evaluators or developers). It was also meant to refer to being selective about which of the model's intrinsic capabilities users can invoke and/or what they can ask it to do with those capabilities.
They talk about "strong information security controls". Selective availability, in that sort of broad sense, is pretty much what that phrase means.
As for the specific issue of choosing the users, that's a very, very standard control. And they talk about "monitoring" what users are doing, which only makes sense if you're prepared to stop them from doing some things. That's selectivity. Any user given access is privileged in the sense of not being one of the ones denied access, although to me the phrase "privileged user" tends to mean a user who has more access than the "average" one.
[still 2]: My understanding is that if the evaluators find the model dangerous, then no users will get access (I could be wrong about this).
From page 4 of the paper:
A simple heuristic: a model should be treated as highly dangerous if it has a capability profile that would be sufficient for extreme harm, assuming misuse and/or misalignment. To deploy such a model, AI developers would need very strong controls against misuse (Shevlane, 2022b) and very strong assurance (via alignment evaluations) that the model will behave as intended.
I can't think what "deploy" would mean other than "give users access to it", so the paper appears to be making a pretty direct implication that users (and not just internal users or evaluators) are expected to have access to "highly dangerous" models. In fact that looks like it's expected to be the normal case.
- I don't think people are expecting the models to be extremely useful without also developing dangerous capabilities.
That seems incompatible with the idea that no users would ever get access to dangerous models. If you were sure your models wouldn't be useful without being dangerous, and you were committed to not allowing dangerous models to be used, then why would you even be doing any of this to begin with?
- From talking to an ARC Evals employee, I know that they are doing a lot of work to ensure they have a buffer with regard to what the users can achieve. In particular, they are[...etc...]
OK, but I'm responding to this paper and to the inferences people could reasonably draw from it, not to inside information.
And the list you give doesn't give me the sense that anybody's internalized the breadth and depth of things users could do to add capabilities. Giving the model access to all the tools you can think of gives you very little assurance about all the things somebody else might interconnect with the model in ways that would let it use them as tools. It also doesn't deal with "your" model being used as a tool by something else. Possibly in a way that doesn't look at all like how you expected it to be used. Nor with it interacting with outside entities in more complex ways than the word "tool" tends to suggest.
As for the paper itself, it does seem to allude to some of that stuff, but then it ignores it.
That's actually the big problem with most paradigms based on "red teaming" and "security auditing", even for "normal" software. You want to be assured not only that the software will resist the specific attacks you happen to think of, but that it won't misbehave no matter what anybody does, at least over a broader space of action than you can possibly test. Just trying things out to see how the software responds is of minimal help there... which is why those sorts of activities aren't primary assurance methods for regular software development. One of the scary things about ML is that most of the things that are primary for other software don't really work on it.
On fine tuning, it hadn't even occurred to me that any user would ever be given any ability to do any kind of training on the models. At least not in this generation. I can see that I had a blind spot there.
In the long term, though, the whole training-versus-inference distinction is a big drag on capability. A really capable system would extract information from everything it did or observed, and use that information thereafter, just as humans and animals do. If anybody figures out how to learn from experience the way humans do, with anything like the same kind of data economy, it's going to be very hard to resist doing it. So eventually you have a very good chance that there'll be systems that are constantly "fine tuning" themselves in unpredictable ways, and that get long-term memory of everything in the process. That's what I was getting at when I mentioned the "shelf life" of architectural assumptions.
- If I understood the paper correctly, by 'stakeholder' they most importantly mean government/regulators.
I think they also mentioned academics and maybe some others.
... which is exactly why I said that they didn't seem to have a meaningful definition of what a "stakeholder" was. Talking about involving "stakeholders", and then acting as though you've achieved that by involving regulators, academics, or whoever, is way too narrow and trivializes the literal meaning of the word "stakeholder".
It feels a lot like talking about "alignment" and acting as though you've achieved it when your system doesn't do the things on some ad-hoc checklist.
It also feels like falling into a common organizational pattern where the set of people tapped for "stakeholder involvement" is less like "people who are affected" and more like "people who can make trouble for us".
- No idea what you are referring to, I don't see any mention in the paper of letting certain people safe access to a dangerous model (unless you're talking about the evaluators?)
As I said, the paper more or less directly says that dangerous models will be deployed. And if you're going to "know your customer", or apply normal access controls, then you're going to be picking people who have such access. But neither prior vetting nor surveillance is adequate.
Finally - I get the feeling that your writing is motivated by your negative outlook,
If you want to go down that road, then I get the feeling that the paper we're talking about, and a huge amount of other stuff besides, is motivated by a need to feel positive regardless of whether it makes sense.
and not by trying to provide good analysis,
That's pretty much meaningless and impossible to respond to.
concrete feedback,
The concrete feedback is that the kind of "evaluation" described in that paper, with the paper's proposed ways of using the results, isn't likely to be particularly effective for what it's supposed to do, but could be a very effective tool for fooling yourself into thinking you'd "done enough".
If you make that kind of approach the centerpiece of your safety system, or even a major pillar of it, then you are probably giving yourself a false sense of security, and you may be diverting energy better used elsewhere. Therefore you should not do that unless those are your goals.
or an alternative plan.
It's a fallacy to respond to "that won't work" with "well, what's YOUR plan?". My not having an alternative isn't going to make anybody else's approach work.
One alternative plan might be to quit building that stuff, erase what already exists, and disband those companies. If somebody comes up with a "real" safety strategy, you can always start again later. That approach is very unlikely to work, because somebody else will build whatever you would have... but it's probably strictly better in terms of mean-time-before-disaster than coming up with rationalizations for going ahead.
Another alternative plan might be to quit worrying about it, so you're happier.
I find it unhelpful.
... which is how I feel about the original paper we're talking about. I read it as an attempt to feel more comfortable about a situation that's intrinsically uncomfortable, because it's intrinsically dangerous, maybe in an intrinsically unsolvable way. If comfort is the goal, then I guess it's helpful, but if being right is the goal, then it's unhelpful. If the comfort took the pressure off of somebody who might otherwise come up with a more effective safety approach, then it would be actively harmful... although I admit that I don't see a whole lot of hope for that anyway.
That is a terrifying paper.
The strategy and mindset I see all through it are "make things we know might be extremely dangerous, then check after the fact to see how much damage we've done".
Even the evaluate-during-training prong amounts to a way to find dangerous approaches that could be continued later. And I mean, wow, they say they might even go as far as "delaying a schedule"! I mean, at least if it's "frontier training". And it makes architectural assumptions that probably have a short shelf life.
There are SO MANY wishful ideas in there...
- That organizations can actually have the institutional self-discipline to control what they deploy.
- That those particular organizations can successfully contain things using "Strong information security controls and systems"... while "selectively" making them available (under commercial pressure to constantly widen the availability).
- That the "dangerous" capabilities are somehow separable from the capabilities you want, or, even less plausibly, that you can reliably get a model to Only Use Its Powers For Good(TM).
- That you can do "alignment evaluation" on anything significant in a way that gives you any meaningful confidence at all, especially under the threat of intentional jailbreaking[1]. All of section 4 is a huge handwave on this.
- That your users can't meaningfully extend the capabilities of what you offer them, either by adding their own or by combining pieces they get from outside sources, and therefore that you can learn anything useful by evaluating a model mostly in isolation (they do mention this, but at all other times they act as if it didn't matter).
- That you can identify "stakeholders", and that they can act usefully on any information you give them... especially when the more "stakeholders" there are, the less plausible it is that you'll be giving them full information. Or indeed that it should be their problem to mitigate the risks you've caused to begin with. Actually when you get down to the detailed proposals for transparency, they've basically given up on anything resembling a meaningful definition of "stakeholder".
- That you can identify who can "safely" be given access to something you've already identified as dangerous, while still having a large and diverse user base[2].
Those are all mostly false. They sort of admit that some of them are false, or at best suspect. They put a bunch of caveats in section 5. Those caveats are notable for being ignored throughout the rest of the paper even though they make it mostly useless in practice.
The most importantly false idea, and the one they least seem to recognize, may be the organizational self-discipline one. Organizations are, as they say, Moloch.
This paper itself is already rationalizing ignoring risks: "We omit many generically useful capabilities (e.g. browsing the internet, understanding text) despite their potential relevance to both the above.". Actually I'm not sure it's fair to say that they rationalize ignoring that. They just flatly say it's out of scope, with no reason given. Which is kind of a classic sign of "that's unthinkable given the Molochian nature of our organizations".
As for actually giving up anything dangerous, an example: If you have an "agency core" and a general-purpose programming assistant, you are almost all of the way to having a robo-hacker. If you have a defensive security analysis system, that puts you even closer. They don't even all have to come from the same source; people can plug these things together very easily.
I do not believe that any of these companies are going to give up on creating agents, or programming assistants, or even on "long horizon planning" or the narrow sense they use for "situational awareness". The idea does not pass the laugh test.
The paper alludes to the possibility that some things actually might not get deployed at all once created, if they turned out to be unexpectedly dangerous. Well, technically, it mentions that some evaluator might take the bold step of recommending against deployment. They're not quite willing to say out loud that anybody in particular ought to stop deployment.
Non-deployment is not going to happen. Not for anything really capable that's already absorbed significant investment. Not with enough probability to matter. That's not how people behave in groups, and not even usually how people behave singly.
Indeed, the paper is already moving on to rationalizing RECOGNIZED dangerous deployments: " To deploy such a model, AI developers would need very strong controls against misuse and very strong assurance (via alignment evaluations) that the model will behave as intended.".
The facts that such security controls don't exist, would be extremely hard to create, and might be impossible to create while remaining commercially viable, are just ignored. Their suggestions in table 3 are incredibly underwhelming. And their list of "security controls" in 3.4 is, um, shall we say, naive. They lead with "red teaming"...
They also ignore the fact that no "strong" assurance that the model will behave as intended probably can exist. Again, the stuff in section 4 is not going to cut it.
Most likely the practical effect of letting this approach become part of the paradigm will be that they'll kid themselves that they've achieved adequate control, by pretending that they can pick trustworthy users, pretending that those "trustworthy" users can't themselves be subverted, and probably also pretending that they can do something about it by surveilling users ("continuous deployment review"). We already have Brad Smith out there talking about "know your customer", which seems to dovetail nicely with this.
The "trustworthy users" thing will help not at all. The surveillance will help a little, until the models leak.
... and even if something is not "deployed", it still exists. At least the knowledge of how to recreate it still exists.
Software leaks. ML misbehaves unpredictably. Most of the utility of these things lies in constantly using them in completely novel ways. You will be dealing with intentional misuse. The paper's comparison to "food, drugs, commercial airliners, and automobiles" is a horrible analogy.
Frankly, in the end, the whole paper reads like an elaborate rationalization for making as little change as possible in what people are already doing, while providing a sort of signifier that "we care". It is not credible as an actual, effective approach to safety. It's not even a major part of such an approach. At best it could be an auditing function, and it would be one of those auditing functions where if you ever had a finding, it meant you had screwed up INCREDIBLY BADLY and been extremely lucky not to have a catastrophe.
The best hope for keeping these labs from deploying really dangerous stuff is still a total shutdown. Which, to be clear, would have to be imposed from the outside. By coercion. Because they are not going to do it themselves. That is very unlikely to be on the table even if it's the right approach.
... and it might not be the right approach, because it still wouldn't help much.
"The labs" aren't the whole issue and may very well not be the main issue. Whoever follows any kind of safety framework, there will also be a lot of people who won't.
There's a 99 point as many nines as you want percent chance that, right this minute, multiple extremely well-resourced actors are pouring tons of work into stuff specifically intended to have most of that paper's listed "dangerous capabilities". The good news is that the first big successes will probably be pretty closely held. The bad news is that we'll be lucky to get a year or two after that before those either leak or get duplicated as open source, and everybody and his dog has access to very capable systems for at least some of those things. My guess is that one of the first out of the box will be autonomous, adaptive computer security penetration (not "conducting offensive cyber operations", ffs).
I actually don't know of any way at all to deal with THAT. Even draconian controls on compute probably wouldn't give you much of a delay.
Pretending that this kind of thing will help, beyond maybe a couple of months of delay of any given bad outcome if you're extremely lucky, is not reasonable. Sorry.
They even talk about existing evaluations for things like "gender and racial biases, truthfulness, toxicity, recitation of copyrighted content" as if the results we've seen were cause for optimism rather than pessimism. ↩︎
... while somehow not setting up an extremely, self-perpetuatingly unfair economy where some people have access to powerful productivity tools that are forbidden to others... ↩︎
I'm not sure I said that.
You didn't, but I thought it was pretty much the entire point of the original article.
I don't think there's a path to that,
There may not be a path, but that doesn't change the fact that not doing it guarantees misery.
and I don't think it's sufficient even if there were.
It's definitely not sufficient. You'd have to replace it with something else. And probably make unrelated changes.
But I was trying to challenge this idea that you were somehow still going to earn your daily bread by selling the product of your labor... presumably to the holders of capital. I mean, the post does mention that you're best off to be a holder of capital, but that's not going to be available to most people.
It's very easy to lose a small pile of capital, and relatively easy to add to a large pile of capital. It always has been, but it's about to get a lot more so. Capital concentrates. So most people are not going to have enough capital to survive just by owning things, at least not unless the system decrees that everybody owns at least some minimum share no matter what. That's definitely not capitalism.
So the post is basically about "working for a living". And that might work through 2026, or 2036, or whatever.
And sure, maybe you can do OK in 2026 or even 2036 by doing what this post suggests. If you do those things, maybe you'll even feel like you're moving up in the world. But most actual humans aren't capable of doing what this post suggests (and many of the rest would be miserable doing it). Some people are going to fall by the wayside. They won't be using AI assistants; they'll be doing things that don't need an AI assistant, but that AI can't do itself. Which are by no means guaranteed to be anything anybody would want to do.
As time goes on, AI will get smarter and more independent, shrinking the "adapt" niche. And robotics will get better, shrinking the "unautomatable work" niche.
There's an irreducible cost to employing somebody. You have to feed that person. Some people already can't produce enough to meet that bar. As the number of things humans can do that AI can't shrinks drastically, the number of such unemployable humans will rise. Fewer and fewer humans will be able to justify their existence.
Yes, that's in an absolute sense. People talk about "new jobs made possible by the new technology". That's wishful thinking. When machines replaced muscle in the industrial revolution, there was a need for brain. Operating even a simple power tool takes a lot of brain. When machines replace brain, that's it. Game over. Past performance is not a guarantee of future results.
In the end game (not by 2026), the only value that literally any human will be able to produce above the cost of feeding that person will be things that are valued only for being done by humans.... and valued by the people or entities that actually have something else to trade for them. There aren't likely to be that many. Humans in general will be no more employable than chimpanzees.
... however, unlike chimpanzees, if you keep capitalism in anything remotely like its current form, humans may very well not be permitted the space or resources to take care of themselves on their own terms. You can't ignore the larger ultra-efficient economy and trade among yourselves, if everything, down to the ground you're standing on, is owned by something or somebody that can get more out of it some other way.
there will remain SOME form personal property,
That's not capitalism. Not unless it's ownership of capital, and really if you want it to look like what the word "capitalism" connotes, it kind of has to be quite a lot of capital. Enough to sustain yourself from what it produces.
and SOME way of varying individual incentive/reward to effective fulfillment of other people's needs,
Again, eventually you're gonna be irrelevant to fulfilling other people's needs. If there's no preparation for that, it's going to come as quite a shock.
The only exception might be the needs of people who are just as frozen out as you are. And there's no guarantee that you will be in a position either to produce what you or they need, or to accumulate capital of your own, because all the prerequisite resources may be owned by somebody else who's using them more "effectively".
and SOME mechanism to make decisions about short- vs long-term risk-taking in where one invests time/resources.
We're headed toward a world in which letting any human make a really major decision about resource allocation would mean inefficient use of the resources. Possibly insupportably inefficient.
We're not there yet. We're not going to be there in 2026, either. But we're heading there faster and faster.
If you want an end-state system where a bunch of benevolent-to-the-point-of-enslavement AIs run everything, supporting humans is a or the major goal for the AIs, an AI's "consumption" is measured by how much support it gets to give to humans, and the AIs run some kind of market system to see which of them "owns" more resources to do that, then that's a capitalist system. But humans aren't the players in that system. And if you're truly superintelligent, you can probably do better. Markets are a pretty good information processing system, but they're not perfect.
In the meantime, the things that let capitalism work among humans are falling apart. Once there's no way to get into the club by using your labor to build up capital from scratch, pre-existing holders of capital become an absolute oligarchy. And capital's tendency to concentrate means it's a shrinking oligarchy. And eventually membership in that oligarchy is decided either by inheritance, or by things you did so long ago that basically nobody remembers them. Or possibly no human at all owns anything... sort of an "Accelerando" scenario.
I think that starts to come into being even before the ultimate end game, but in any case it's going to happen eventually.
That's not a tenable system, it's not an equitable system, and only a very small proportion of people could "thrive" under it. It would collapse if not sustained by insane amounts of force. The longer we keep moving toward such a world, the more extreme the collapse is likely to be.
So, yeah, there may not be a path to fixing it, but that means we're all boned, not that we're thriving.
The best way for MOST people to "thrive" in the kind of economy you describe would be to dismantle capitalism.
Independent of potential for growing into AGI and {S,X}-risk resulting from that?
With the understanding that these are very rough descriptions that need much more clarity and nuance, that one or two of them might be flat out wrong, that some of them might turn out to be impossible to codify usefully in practice, that there might be specific exceptions for some of them, and that the list isn't necessarily complete--
- Recommendation systems that optimize for "engagement" (or proxy measures thereof).
- Anything that identifies or tracks people, or proxies like vehicles, in spaces open to the public. Also collection of data that would be useful for this.
- Anything that mass-classifies private communications, including closed group communications, for any use by anybody not involved in the communication.
- Anything specifically designed to produce media showing real people in false situations or to show them saying or doing things they have not actually done.
- Anything that adaptively tries to persuade anybody to buy anything or give anybody money, or to hold or not hold any opinion of any person or organization.
- Anything that tries to make people anthropomorphize it or develop affection for it.
- Anything that tries to classify humans into risk groups based on, well, anything.
- Anything that purports to read minds or act as a lie detector, live or on recorded or written material.
Actually, my point in this post is that we don't NEED AGI for a great future, because often people equate Not AGI = Not amazing future (or even a terrible one) and I think this is wrong.
I don't have so much of a problem with that part.
It would prevent my personal favorite application for fully generally strongly superhuman AGI... which is to have it take over the world and keep humans from screwing things up more. I'm not sure I'd want humans to have access to some of the stuff non-AGI could do... but I don't think there's any way to prevent that.
If we build a misaligned AGI, we're dead. So there are only two options: A) solve alignment, B) not build AGI. If not A), then there's only B), however "impossible" that may be.
C) Give up.
Anyway, I haven't seen you offer an alternative.
You're not going to like it...
Personally, if made king of the world, I would try to discourage at least large scale efforts to develop either generalized agents or "narrow AI", especially out of opaque technology like ML. That's because narrow AI could easily become parts or tools for a generalized agent, because many kinds of narrow AI are too dangerous in human hands, and because the tools and expertise for narrow AI are too close to those for generalized AGI. It would be extremely difficult to suppress one in practice without suppressing the other.
I'd probably start by making it as unprofitable as I could by banning likely applications. That's relatively easy to enforce because many applications are visible. A lot of the current narrow AI applications need bannin' anyhow. Then I'd start working on a list of straight-up prohibitions.
Then I'd dump a bunch of resources into research on assuring behavior in general and on more transparent architectures. I would not actually expect it to work, but it has enough of a chance to be worth a try. That work would be a lot more public than most people on Less Wrong would be comfortable with, because I'm afraid of nasty knock-on effects from trying to make it secret. And I'd be a little looser about capability work in service of that goal than in service of any other.
I would think very hard about banning large aggregations of vector compute hardware, and putting various controls on smaller ones, and would almost certainly end up doing it for some size thresholds. I'm not sure what the thresholds would be, nor exactly what the controls would be. This part would be very hard to enforce regardless.
I would not do anything that relied on perfect enforcement for its effectiveness, and I would not try to ramp up enforcement to the point where it was absolutely impossible to break my rules, because I would fail and make people miserable. I would titrate enforcement and stick with measures that seemed to be working without causing horrible pain.
I'd hope to get a few years out of that, and maybe a breakthrough on safety if I were tremendously lucky. Given perfect confidence in a real breakthrough, I would try to abdicate in favor of the AGI.
If made king of only part of the world, I would try to convince the other parts to collaborate with me in imposing roughly the same regime. How I reacted if they didn't do that would depend on how much leverage I had and what they did seem to be doing. I would try really, really hard not to start any wars over it. Regardless of what they said they were doing I would assume that they were engaging in AGI research under the table. Not quite sure what I'd do with that assumption, though.
But I am not king of the world, and I do not think it's feasible for me to become king of the world.
I also doubt that the actual worldwide political system, or even the political systems of most large countries, can actually be made to take any very effective measures within any useful amount of time. There are too many people out there with too many different opinions, too many power centers with contrary interests, too much mutual distrust, and too many other people with too much skill at deflecting any kind of policy initiative down ways that sort of look like they serve the original purpose, but mostly don't. The devil is often in the details.
If it is possible to get the system to do that, I know that I am not capable of doing so. I mean, I'll vote for it, maybe write some letters, but I know from experience that I have nearly no ability to persuade the sorts of people who'd need to be persuaded.
I am also not capable of solving the technical problem myself and doing some "pivotal act". In fact I'm pretty sure I have no technical ideas for things to try that aren't obvious to most specialists. And I don't much buy any of the ideas I've heard from other people.
My only real hopes are things that neither I nor anybody else can influence, especially not in any predictable direction, like limitations on intelligence and uncertainty about doom.
So my personal solution is to read random stuff, study random things, putter around in my workshop, spend time with my kid, and generally have a good time.
Replying to myself to clarify this:
A climate change defector also doesn't get to "align" the entire future with the defector's chosen value system.
I do understand that the problem with AGI is exactly that you don't know how to align anything with anything at all, and if you know you can't, then obviously you shouldn't try. That would be stupid.
The problem is that there'll be an arms race to become able to do so... and a huge amount of pressure to deploy any solution you think you have as soon as you possibly can. That kind of pressure leads to motivated cognition and institutional failure, so you become "sure" that something will work when it won't. It also leads to building up all the prerequisite capabilities for a "pivotal act", so that you can put it into practice immediately when (you think) you have an alignment solution.
... which basically sets up a bunch of time bombs.
Are you saying that I'd have to kill everyone so noone can build AGI?
Yup. Anything short of that is just a delaying tactic.
From the last part of your comment, you seem to agree with that, actually. 1000 years is still just a delay.
But I didn't see you as presenting preventing fully general, self-improving AGI as a delaying tactic. I saw you as presenting it as a solution.
Also, isn't suppressing fully general AGI actually a separate question from building narrow AI? You could try to suppress both fully general AGI and narrow AI. Or you could build narrow AI while still also trying to do fully general AGI. You can do either with or without the other.
you have to provide evidence that a) this is distracting relevant people from doing things that are more productive (such as solving alignment?)
I don't know if it's distracting any individuals from finding any way to guarantee good AGI behavior[1]. But it definitely tends to distract social attention from that. Finding one "solution" for a problem tends to make it hard to continue any negotiated process, including government policy development, for doing another "solution". The attitude is "We've solved that (or solved it for now), so on to the next crisis". And the suppression regime could itself make it harder to work on guaranteeing behavior.
True, I don't know if the good behavior problem can be solved, and am very unsure that it can be solved in time, regardless.
But at the very least, even if we're totally doomed, the idea of total, permanent suppression distracts people from getting whatever value they can out of whatever time they have left, and may lead them to actions that make it harder for others to get that value.
AND b) that solving alignment before we can build AGI is not only possible, but highly likely.
Oh, no, I don't think that at all. Given the trends we seem to be on, things aren't looking remotely good.
I do think there's some hope for solving the good behavior problem, but honestly I pin more of my hope for the future on limitations of the amount of intelligence that's physically possible, and even more on limitations of what you can do with intelligence no matter how much of it you have. And another, smaller, chunk on it possibly turning out that a random self-improving intelligence simply won't feel like doing anything that bad anyway.
... but even if you were absolutely sure you couldn't make a guaranteed well-behaved self-improving AGI, and also absolutely sure that a random self-improving AGI meant certain extinction, it still wouldn't follow that you should turn around and do something else that also won't work. Not unless the cost were zero.
And the cost of the kind of totalitarian regime you'd have to set up to even try for long-term suppression is far from zero. Not only could it stop people from enjoying what remains, but when that regime failed, it could end up turning X-risk into S-risk by causing whatever finally escaped to have a particularly nasty goal system.
For all the people who continuously claim that it's impossible to coordinate humankind into not doing obviously stupid things, here are some counter examples: We have the Darwin awards for precisely the reason that almost all people on earth would never do the stupid things that get awarded. A very large majority of humans will not let their children play on the highway, will not eat the first unknown mushrooms they find in the woods, will not use chloroquine against covid, will not climb into the cage in the zoo to pet the tigers, etc.
Those things are obviously bad from an individual point of view. They're bad in readily understandable ways. The bad consequences are very certain and have been seen many times. Almost all of the bad consequences of doing any one of them accrue personally to whoever does it. If other people do them, it still doesn't introduce any considerations that might drive you to want to take the risk of doing them too.
Yet lots of people DID (and do) take hydroxychloroquine and ivermectin for COVID, a nontrivial number of people do in fact eat random mushrooms, and the others aren't unheard-of. The good part is that when somebody dies from doing one of those things, everybody else doesn't also die. That doesn't apply to unleashing the killer robots.
... and if making a self-improving AGI were as easy as eating the wrong mushrooms, I think it would have happened already.
The challenge here is not the coordination, but the common acceptance that certain things are stupid.
Pretty much everybody nowadays has a pretty good understanding of the outlines of the climate change problem. The people who don't are pretty much the same people who eat horse paste. Yet people, in the aggregate, have not stopped making it worse. Not only has every individual not stopped, but governments have been negotiating about it for like 30 years... agreeing at every stage on probably inadequate targets... which they then go on not to meet.
... and climate change is much, much easier than AGI. Climate change rules could still be effective without perfect compliance at an individual level. And there's no arms race involved, not even between governments. A climate change defector may get some economic advantage over other players, but doesn't get an unstoppable superweapon to use against the other players. A climate change defector also doesn't get to "align" the entire future with the defector's chosen value system. And all the players know that.
Speaking of arms races, many people think that war is stupid. Almost everybody thinks that nuclear war is stupid, even if they don't think nuclear deterrence is stupid. Almost everybody thinks that starting a war you will lose is stupid. Yet people still start wars that they will lose, and there is real fear that nuclear war can happen.
This is maybe hard in certain cases, but NOT impossible. Sure, this will maybe not hold for the next 1,000 years, but it will buy us time.
I agree that suppressing full-bore self-improving ultra-general AGI can buy time, if done carefully and correctly. I'm even in favor of it at this point.
But I suspect we have some huge quantitative differences, because I think the best you'll get out of it is probably less than 10 years, not anywhere near 1000. And again I don't see what substituting narrow AI has to do with it. If anything, that would make it harder by requiring you to tell the difference.
I also think that putting too much energy into making that kind of system "non-leaky" would be counterproductive. It's one thing to make it inconvenient to start a large research group, build a 10,000-GPU cluster, and start trying for the most agenty thing you can imagine. It's both harder and more harmful to set up a totalitarian surveillance state to try to control every individual's use of gaming-grade hardware.
And there are possible measures to reduce the ability of the most stupid 1% of humanity to build AGI and kill everyone.
What in detail would you like to do?
I don't like the word "alignment" for reasons that are largely irrelevant here. ↩︎
I'll say it. It definitely can't be done.
You cannot permanently stop self-improving AGI from being created or run. Not without literally destroying all humans.
You can't stop it for a "civilizationally significant" amount of time. Not without destroying civilization.
You can slow it down by maybe a few years (probably not decades), and everybody should be trying to do that. However, it's a nontrivial effort, has important costs of many kinds, quickly reaches a point of sharply diminishing returns, is easy to screw up in actually counterproductive ways, and involves meaningful risk of making the way it happens worse.
If you want to quibble, OK, none of those things are absolute certainties, because there are no absolute certainties. They are, however, the most certain things in the whole AI risk field of thought.
What really concerns me is that the same idea has been coming up continuously since (at least) the 1990s, and people still talk about it as if it were possible. It's dangerous; it distracts people into fantasies, and keeps them from thinking clearly about what can actually be done.
I've noticed a pattern of such proposals talking about what "we" should do, as if there were a "we" with a unified will that ever could or would act in a unified way. There is no such "we". Although it's not necessarily wrong to ever use the word "we", using it carelessly is a good way to lead yourself into such errors.
Quickly, 'cuz I've been spending too much time here lately...
One. If my other values actively conflict with having more than a certain given number of people, then they may overwhelm the considerations we're talking about here and make them irrelevant.
Three. It's not that you can't do it precisely. It's that you're in a state of sin if you try to aggregate or compare them at all, even in the most loose and qualitative way. I'll admit that I sometimes commit that sin, but that's because I don't buy into the whole idea of rigorous ethical philosophy to begin with. And only in extremis; I don't think I'd be willing to commit it enough for that argument to really work for me.
Four. I'm not sure what you mean by "distribution of happiness". That makes it sound like there's a bottle of happiness and we're trying to decide who gets to drink how much of it, or how to brew more, or how we can dilute it, or whatever. What I'm getting at is that your happiness and my happiness aren't the same stuff at all; it's more like there's a big heap of random "happinesses", none of them necessarily related to or substitutable for the others at all. Everybody gets one, but it's really hard to say who's getting the better deal. And, all else being equal, I'd rather have them be different from each other than have more identical ones.
OK, wait, I think I get it. It's an anthropic thing. You happen to be human, and humans happen to be change-blind, so you take advantage of that to run your simulation, and we observe it because you wouldn't have run the simulation if you (and therefore we) weren't change-blind. Is that right?
I assumed you meant that you (as the one running the simulation) had arranged for people to be change-blind. Which means that there's no particular reason that you yourself would be change-blind.
So you can't just make the people copies of yourself, or the world a copy of your own world. You have to design them from scratch, and then put together a whole history for the universe so that their having evolved to be change-blind fits with the supposed past.
On edit: and of course you can't just let them evolve and assume they'll be change-blind, unless you have a pretty darned impressive ability to predict how that will come out.
Doesn't that mean you have to do an awful lot of work to design everything in tremendous detail, and also fabricate the back story?
If the simulation were running on a substrate very different from the "reality" being simulated, then it might not have the same resource limitations we're used to, and it might not have any resource-conserving hacks in it.
If you have infinite computing power, and are in a position to simulate all of the physics we can access starting from the most ontologically fundamental rules, with no approximations, quantizations, or whatever, it's relatively easy to write something that won't glitch. To get the physics we seem to have, you might actually have to have "uncountably infinite computing power", but what's special about ℵ₀ anyhow?
Admittedly, I don't know if the entities that existed in such a universe would count as "biological". And if you keep going down that road you start to run into serious questions about what counts as a simulation and what counts as reality, and the next thing you know you're arguing with a bunch of dragonflies and losing.
On the other hand, such entities would be more plausibly able to run whatever random simulations struck their fancies than entities stuck in a world like ours. Anybody operating in our own physics would frankly have to be pretty crazy to waste resources on running this universe.
Or maybe it's actually a really crappy, complicated, buggy simulation, but the people running it detect glitches and stop/rewind every time one happens, and if they can't do that they just edit you so you don't notice it.
I have probably heard those arguments, but the particular formulation you mention appears to be embedded in a book of ethical philosophy, so I can't check, because I haven't got a lot of time or money for reading whole ethical philosophy books. I think that's a mostly doomed approach that nobody should spend too much time on.
I looked at the Wikipedia summary, for whatever that's worth, and here are my standard responses to what's in there:
- I reject the idea that I only get to assign value to people and their quality of life, and don't get to care about other aspects of the universe in which they're embedded, or about their effects on it. I am, if you push the scenario hard enough, literally willing to value maintaining a certain amount of VOID, sort of a "void preserve", if you will, over adding more people. And it gets even hairier if you start asking difficult questions about what counts as a "person" and why. And if you broaden your circle of concern enough, it starts to get hard to explain why you give equal weight to everything inside it.
- Even if you do restrict yourself only to people, which again I don't, step 1, from A to A+, doesn't exactly assume that you can always add a new group of people without in any way affecting the old ones, but it does seem to encourage thinking that way, which is not necessarily a win.
- Step 2, where "total and average happiness increase" from A+ to B-, is the clearest example of how the whole argument requires aggregating happiness... and it's not a valid step. You can't legitimately talk about, let alone compute, "total happiness", "average happiness", "maximum happiness", or indeed ANYTHING that requires you to put two or more people's happiness on the same scale. You may not even be able to do it for one person. At MOST you can impose a very weak partial ordering on states of the universe (I think that's the sort of thing Pareto talked about, but again I don't study this stuff...). And such a partial ordering doesn't help at all when you're trying to look at populations.
- If you could aggregate or compare happiness, the way you did it wouldn't necessarily be independent of things like how diverse various people's happiness was; happiness doesn't have to be a fungible commodity. As I said before, I'd probably rather create two significantly different happy people than a million identical "equally happy" people.
So I don't accept that the argument requires me to accept the repugnant conclusion on pain of having intransitive preferences.
That said, of course I do have some non-transitive preferences, or at least I'm pretty sure I do. I'm human, not some kind of VNM-thing. My preferences are going to depend on when you happen to ask me a question, how you ask it, and what particular consequences seem most salient. Sure, I often prefer to be consistent, and if I explicitly decided on X yesterday I'm not likely to choose Y tomorrow. Especially not if I feel like maybe I've led somebody to depend on my previous choice. But consistency isn't always going to control absolutely.
Even if it were possible, getting rid of all non-transitive preferences, or even all revealed non-transitive preferences, would demand deeply rewriting my mind and personality, and I do not at this time wish to do that, or at least not in that way. It's especially unappealing because every set of presumably transitive preferences that people suggest I adopt seems to leave me preferring one or another kind of intuitively crazy outcome, and I believe that's probably going to be true of any consistent system.
My intuitions conflict, because they were adopted ad-hoc through biological evolution, cultural evolution, and personal experience. At no point in any of that were they ever designed not to conflict. So maybe I just need to kind of find a way to improve the "average happiness" of my various intuitions. Although if I had to pursue that obviously bogus math analogy any further, I'd say something like the geometric mean would be closer.
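Just to show why I'd reach for the geometric mean rather than the arithmetic one, here's a toy sketch. All the numbers are invented; the only point is that an arithmetic mean will cheerfully trade one intuition away completely if the others do well, while a geometric mean collapses to zero the moment any single intuition gets totally trampled:

```python
from math import prod

# Invented "satisfaction scores" for a handful of ad-hoc moral intuitions,
# on a made-up 0-to-1 scale. Scenario B fully violates one intuition.
scenario_a = [0.7, 0.6, 0.8, 0.7]   # everything partly satisfied
scenario_b = [1.0, 1.0, 1.0, 0.0]   # one intuition completely trampled

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    return prod(xs) ** (1 / len(xs))

for name, xs in [("A", scenario_a), ("B", scenario_b)]:
    print(name, round(arithmetic_mean(xs), 3), round(geometric_mean(xs), 3))

# The arithmetic mean ranks B (0.75) above A (0.7); the geometric mean sends
# B to 0, which matches the intuition that no single intuition should get
# traded away entirely just because the others are doing fine.
```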
I suspect you can also find some intransitive preferences of your own if you go looking, and would find more if you had a perfect view of all your preferences and their consequences. And I personally think you're best off rolling with that. Maybe intransitive preferences open you up to being Dutch-booked, but trying to have absolutely transitive preferences is likely to make it even easier to get you to go do something intuitively catastrophic, while telling yourself you have to want it.
If you have the chance to create lives that are worth living at low cost, while knowing that you are not going to increase suffering by any unbearable amount, why wouldn't you?
Well, I suppose I would, especially if it meant going from no lives lived at all to some reasonable number of lives lived. "I don't care" is unduly glib; I don't care enough to do it if it came at a major cost to me, and definitely not given the number of lives already around.
I guess I'd be more likely to care somewhat more if those lives were diverse. Creating a million exactly identical lives seems less cool than creating just two significantly different ones. And the difference between a billion and a trillion is pretty unmoving to me, probably because I doubt the diversity of experiences among the trillion.
So long as I take reasonable care not to actively actualize a lot of people who are horribly unhappy on net, manipulating the number of future people doesn't seem like some kind of moral imperative to me, more like an aesthetic preference to sculpt the future.
I'm definitely not responsible for people I don't create, no matter what. I am responsible for any people I do create, but that responsibility is more in the nature of "not obviously screwing them over and throwing them into predictable hellscapes" than being absolutely sure they'll all have fantastic lives.
I would actively resist packing the whole Universe with humans at the maximum just-barely-better-than-not-living density, because it's just plain outright ugly. And I can't even imagine how I could figure out an "optimal" density from the point of view of the experiences of the people involved, even if I were invested in nonexistent people.
Those people would also say that they would prefer to have lived rather than not to have lived, just as you presumably would.
I don't feel like I can even formulate a preference between those choices. I don't just mean that one is as good as the other. I mean that the whole question seems pointless and kind of doesn't compute. I recognize that it does make some kind of sense in some way, but how am I supposed to form a preference about the past, especially when my preferences, or lack thereof, would be modified by the hypothetical-but-strictly-impossible enactment of that preference? What am I supposed to do with that kind of preference if I have it?
Anyway, if a given person doesn't exist, in the strongest possible sense of nonexistence, where they don't appear anywhere in the timeline, then that person doesn't in fact have any preferences at all, regardless of what they "would" say in some hypothetical sense. You have to exist to prefer something. The nonexistent preferences of nonexistent people are, well... not exactly compelling?
I mean, if you want to go down that road, no matter what I do, I can only instantiate a finite number of people. If I don't discount in some very harsh way for lack of diversity, that leaves an infinite number of people nonexistent. If I continue on the path of taking nonexistent people's preferences into account; and I discover that even a "tiny" majority of those infinite nonexistent people "would" feel envy and spite for the people who do exist, and would want them not to exist; then should I take that infinite amount of preference into account, and make sure not to create anybody at all? Or should I maybe even just not create anybody at all out of simple fairness?
I think I have more than enough trouble taking even minimal care of even all the people who definitely do exist.
If you consider that pleasure has inherent positive value (more pleasure implies more positive value, same thing), why stop at a fixed amount of pleasure when you can create more by adding more worth-living lives? Stopping is more arbitrary.
At a certain point the whole thing stops being interesting. And at a certain point after that, it just seems like a weird obsession. Especially if you're giving up on other things. If you've populated all the galaxies but one, that last empty galaxy seems more valuable to me than adding however many people you can fit into it.
Also, what's so great about humans specifically? If I wanted to maximize pleasure, shouldn't I try to create a bunch of utility monsters that only feel pleasure, instead of wasting resources on humans whose pleasure is imperfect? If you want, it can be utility monsters with two capacities: to feel pleasure, and to, in whatever sense you like, prefer their own existence to their nonexistence. And if I do have to create humans, should I try to make them as close to those utility monsters as possible while still meeting the minimum definition of "human"?
If you consider that something has positive value, that typically implies that a universe with more of that thing is better.
I like cake. I don't necessarily want to stuff the whole universe with cake (or paperclips). I can't necessarily say exactly how much cake I want to have around, but it's not "as much as possible". Even if I can identify an optimal amount of anything to have, the optimum does not have to be the maximum.
... and, pattern matching on previous conversations and guessing where this one might go, I think that formalized ethical systems, where you try to derive what you "should" do using logical inference from some fixed set of principles, are pointless and often dangerous. That includes all of the "measure and maximize pleasure/utility" variants, especially if they require you to aggregate people's utilities into a common metric.
There's no a priori reason you should expect to be able to pull anything logically consistent out of a bunch of ad-hoc, evolved ethical intuitions, and experience suggests that you can't do that. Everybody who tries seems to come up with something that has implications those same intuitions say are grossly monstrous. And in fact when somebody gets power and tries to really enact some rigid formalized system, the actual consequences tend to be monstrous.
"Humanclipping" the universe has that kind of feel for me.
That was a nice clear explanation. Thank you.
... but you still haven't sold me on it mattering.
I don't care whether future generations get born or not. I only care whether people who actually are born do OK. If anything, I find it creepy when Bostrom or whoever talks about a Universe absolutely crawling with "future generations", and how critical it supposedly is to create as many as possible. It always sounds like a hive or a bacterial colony or something.
It's all the less interesting because a lot of people who share that vision seem to have really restricted ideas of who or what should count as a "future generation". Why are humans the important class, either as a reference class or as a class of beings with value? And who's in the "human club" anyway?
Seems to me that the biggest problem with an apocalypse isn't that a bunch of people never get born; it's that a bunch of living people get apocalypsticized. Humans are one thing, but why should I care about "humanity"?
If it "solves all your problems" in a way that leaves you bored or pithed or wireheaded, then you still have a problem, don't you? At least you have a problem according to your pre-wireheading value system, which ought to count for something.
That, plus plain old physical limitations, may mean that even talking about "solving all of anybody's problems" is an error.
Also, I'm not so sure that there's an "us" such that you can talk about "our problems". Me getting what makes me Truly Happy may actually be a problem from your point of view, or of course vice versa. It may or may not make sense to talk about "educating" anybody out of any such conflict. Irreconcilable differences seem very likely to be a very real thing, at least on relatively minor issues, but quite possibly in areas that are and will remain truly important to some people.
But, yeah, as I think you allude to, those are all nice problems to have. For now I think it's more about not ending up dead, or inescapably locked into something that a whole lot of people would see as an obvious full-on dystopia. I'm not as sure as you seem to be that you can assure that without rapidly going all the way to full-on superintelligence, though.
If you want a harmless target that they're actually trying to prevent, try to get it to tell you how to make drugs or bombs. If you succeed, all you'll get will be a worse, less reliable version of information that you can easily find with a normal Internet search, but it still makes your point.
On edit: Just to be clear, it is NOT illegal in most places to seek or have that information. The only counterexample I know of in "the West" is the UK, but even they make an exception if you have a legitimate reason other than actual bomb making. The US tried to do something about publishing some such information, but it required intent that the information be abused as an element of the offense, I'm not sure how far it got, and it's constitutionally, um, questionable.
- Everybody with a credit card has access to supercomputers. There is zero effective restriction on what you do with that access, and it's probably infeasible to put such restrictions into place at all, let alone soon enough to matter. And that doesn't even get into the question of stolen access. Or of people or institutions who have really significant amounts of money.
- (a) There are some people in large companies and governments who understand the risks... along with plenty of people who don't. In an institution with N members, there are probably about 1.5 times N views of what "the risks" are. (b) Even if there were broad agreement on some important points, that wouldn't imply that the institution as a whole would respond either rationally or quickly enough. The "alignment" problem isn't solved for organizations (cf "Moloch"). (c) It's not obvious that even a minority of institutions getting it wrong wouldn't be catastrophic.
- (a) They don't have to "release" it, and definitely not on purpose. There's probably a huge amount of crazy dangerous stuff going on already outside the public eye[1]. (b) A backlash isn't necessarily going to be fast enough to do any good. (c) One extremely common human and institutional behavior, upon seeing that somebody else has a dangerous capability, is to seek to get your hands on something more dangerous for "defense". Often in secret. Where it's hard for any further "backlash" to reach you. And people still do it even when the "defense" won't actually defend them. (d) If you're a truly over the top evil sci-fi superintelligence, there's no reason you wouldn't solve a bunch of problems to gain trust and access to more power, then turn around and defect.
- (a) WHA? Getting ChatGPT to do "unaligned" things seems to be basically the world's favorite pastime right now. New ones are demonstrated daily. RLHF hasn't even been a speed bump. (b) The definition of "alignment" being used for the current models is frankly ridiculous. (c) If you're training your own model, nothing forces you to take any steps to align it with anything under any definition. For the purpose of constraining how humans use AI, "solving alignment" would mean that you were able to require everybody to actually use the solution. (d) If you manage to align something with your own values, that does not exclude the possibility that everybody else sees your values as bad. If I actively want to destroy the world, then an AGI perfectly aligned with me will... try to destroy the world. (e) Even if you don't train your own model, you can still use (or pirate) whichever one is the most "willing" to do what you want to do. ChatGPT isn't a monopoly. (f) Eventual convergence theorems aren't interesting unless you think you'll actually get to the limit. Highly architecture-specific theorems aren't interesting at all.
- (a) If you're a normal individual, that's why you have a credit card. But, yes, total havoc is probably beyond normal individuals anyway. (b) If you're an organization, you have more resources. And, again, your actions as an organization are unlikely to perfectly reflect the values or judgment of the people who make you up. (c) If you're a very rich maniac, you have organizational-level resources, including assistance from humans, but not much more than normal-individual-level internal constraints. We seem to have an abundance of rich maniacs right now, many of them with actual technical skills of their own. To get really insane outcomes, you do not have to democratize the capability to 8 billion people. 100 thousand should be plenty. Even 10 thousand.
- (a) Sure, North Korea is building the killer robots. Not, say, the USA. That's a convenient hope, but relying on it makes no sense. (b) Even North Korea has gotten pretty good at stealing access to other people's computing resources nowadays. (c) The special feature of AGI is that it can, at least in principle, build more, better AGI. Including designing and building any necessary computers. For the purposes of this kind of risk analysis, near-worst-case assumptions are usually the conservative ones, so the conservative assumption is that it can make 100 years of technical progress in a year, and 1000 in two years. And military people everywhere are well aware that overall industrial capacity, not just having the flashiest guns, is what wins wars. (d) Some people choosing to build military robots does not exclude other people from choosing to build grey goo[2].
- (a) People are shooting each other just for the lulz. They always have, and there seems to be a bit of a special vogue for it nowadays. Nobody suggested that everybody would do crazy stuff. It only takes a small minority if the per capita damage is big enough. (b) If you arrest somebody for driving over others, that does not resurrect the people they hit. And you won't be ABLE to arrest somebody for taking over or destroying the world. (c) Nukes, cars, and guns don't improve themselves (nor does current ML, but give it a few years...).
For example, I would be shocked if there aren't multiple serious groups working, in various levels of secrecy, on automated penetration of computer networks using all kinds of means, including but NOT limited to self-found zero-days. Building, and especially deploying, an attack agent is much easier than building or deploying the corresponding defensive systems. Not only will such capabilities probably be abused by those who develop them, but they could easily leak to others, even to the general public. Apocalypse? I don't think so. A lot of Very Bad Days for a lot of people? Very, very likely. And that's just one thing people are probably working on. ↩︎
I'm not arguing that grey goo is feasible, just pointing out that it's not like one actor choosing to build military robots keeps another actor from doing anything else. ↩︎
I guess maybe. A system like that isn't easy to set up, and it's not like there aren't plenty of scams out there already to provide whatever incentives.
To have helped with the publicized incident, the verification would have had to be both mandatory and very strong, because the scammer was claiming to be calling from the kidnapper's phone, and could easily have made a totally credible claim that the victim's phone was unavailable. That means no anonymous phone calls, anywhere, ever. A system where it's impossible to communicate anonymously is very far from an unalloyed good, so it may or may not be a "positive consequence" at all on the whole.
Also, for the niche that voices were filling, anything that demands that you carry a device around with you is just plain not as good.
- It's pretty rare to get so banged up that your face and voice are unrecognizable, especially if you can still communicate at all. Devices, on the other hand, get lost or broken quite a bit, including in cases where you might be trying to ask somebody you knew for money.
In the common "I got arrested" scam, the mark expects that the impersonated person's phone won't be available to them. The victim could of course notice that the person isn't calling from a police station, assuming the extra constraint that the identification system delivers an identifier that's unambiguously not a police station... but that just means the scammer switches to the equally common "I got mugged" or "car accident" scams. There are so many degrees of freedom that you can work around almost any technical measure.
- Voices (used to) bind the content of a message directly to a person's vocal tract, and faces on video came pretty close to binding the message to the face. Device-based authentication relies on a much longer chain of steps, probably person to ID card/database photo to phone company records to crypto certificate to key to device (there's a toy sketch of that chain at the end of this comment). And, off on the side, the ID card database has to bind that face to information that can actually physically locate a scammer. Any of those steps can be subverted, and it's a LOT of work to secure all of them, especially because...
- With no coordination at all, everybody on the planet automatically gets a face and a voice that's "compatible with the system", and directly available to important relying parties (namely the people who actually know you and who are likely to be scam victims).
Your device, on the other hand, may be certified by any number of different carriers, manufacturers, or governments, who have to cooperate in really complicated ways to get any kind of real verification. It takes forever and costs a lot to set up anything like that at the scale of a worldwide phone system.
It would be easier to set up intra-family "web of trust" device-based authentication... but of course that fails on the "mandatory" and "automatic" parts.
Device-based authentication can be stronger in many ways than vocal or visual authentication could ever be, and in some cases it's obviously superior, but I don't think it's a satisfying substitute. And most of its advantages tend to show up in much smaller communities/namespaces than the total worldwide phone system.
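Here's the toy sketch of that chain of bindings I mentioned above. The link names and the assurance numbers are entirely invented; the only point is that the end-to-end claim "this call really comes from that locatable person" is a conjunction of every link holding, and it can only get weaker as you add links and cooperating parties:

```python
# A hypothetical device-based caller-verification chain. Each link is a
# binding that an attacker could subvert independently; the end-to-end
# binding holds only if every link in the chain holds.
chain = [
    ("person -> ID document photo",    0.98),  # invented assurance estimates
    ("ID document -> carrier records", 0.95),
    ("carrier records -> certificate", 0.97),
    ("certificate -> private key",     0.99),
    ("private key -> physical device", 0.90),  # devices get lost, stolen, borrowed
]

end_to_end = 1.0
for link, p_holds in chain:
    end_to_end *= p_holds
    print(f"{link:34s} {p_holds:.2f}  running assurance {end_to_end:.2f}")

# With these made-up numbers the chain bottoms out around 0.80, versus a
# single direct binding (voice/face -> person) that needs no cooperating
# institutions at all. The exact numbers don't matter; the shape does.
```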
Someone launched a truly minimum-viable-product attack, without doing any of their homework, and quickly got caught, showing us what is coming.
They didn't get caught; they got detected. They're still out there, free to iterate on the strategy until they get good at it. They incurred almost no cost with this initial probe.
Like other forms of spam and social engineering, this is not going to be difficult for people ‘on the ball’ to defend against any time soon, but we should worry about the vulnerable, especially the elderly, and ensure they are prepared.
I've gotten phishes that I wasn't sure about until I investigated them using tools and strategies not easily available to most "on the ball" people. And they weren't even spear phishes. You can fool almost anybody if you have a reasonable amount of information about them and tailor the attack to them.
And "immunity" is not without cost. If it gets to the point where a large class of legitimate messages have to be ignored because they can't be distinguished from false ones, that in itself does real damage.
Voices and faces used to be very convenient, easy, relatively reliable authentication tools, and it hurts to lose something like that. Also, voices and faces are kind of an emotional "root password". Humans may be hardwired to find it hard to ignore them. At the very least, even if they are ignored, it's going to be actually painful to do it.
I mean, I'm not saying it's the apocalypse, and there are plenty of ways to scam without AI, but this stuff is not good AT ALL.
The relevance should be clear: in the limit of capabilities, such systems could be dangerous.
What I'm saying is that reaching that limit, or reaching any level qualitatively similar to that limit, via that path, is so implausible, at least to me, that I can't see a lot of point in even devoting more than half a sentence to the possibility, let alone using it as a central hypothesis in your planning. Thus "irrelevant".
It's at least somewhat plausible that you could reach a level that was dangerous, but that's very different from getting anywhere near that limit. For that matter, it's at least plausible that you could get dangerous just by "imitation" rather than by "prediction". So, again, why put so much attention into it?
Except for the steadily-increasing capabilities they continue to display as they scale? Also my general objection to the phrase "no reason"/"no evidence"; there obviously is evidence, if you think that evidence should be screened off please argue that explicitly.
OK, there's not no evidence. There's just evidence weak enough that I don't think it's worth remarking on.
I accept that they've scaled a lot better than anybody would have expected even 5 years ago. And I expect them to keep improving for a while.
But...
- They're not so opaque as all that, and they're still just using basically pure statistics to do their prediction, and they're still basically doing just prediction, and they're still operating with finite resources.
- When you observe something that looks like an exponential in real life, the right way to bet is almost always that it's really a sigmoid.
Whenever you get a significant innovation, you would expect to see a sudden ramp-up in capability, so actually seeing such a ramp-up, even if it's bigger than you would have expected, shouldn't cause you to update that much about the final outcome.
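A toy illustration of that, with completely made-up parameters: an exponential and a logistic (sigmoid) curve with the same early growth rate are nearly indistinguishable from inside the early data, which is exactly when everybody is deciding how impressed to be.

```python
import math

# Compare an exponential with a logistic (sigmoid) curve that has the same
# early growth rate but a hard ceiling. All parameter values are arbitrary.
def exponential(t, x0=1.0, r=0.5):
    return x0 * math.exp(r * t)

def logistic(t, x0=1.0, r=0.5, ceiling=100.0):
    return ceiling / (1 + (ceiling / x0 - 1) * math.exp(-r * t))

for t in range(0, 16, 3):
    print(f"t={t:2d}  exponential={exponential(t):7.1f}  sigmoid={logistic(t):6.1f}")

# Early on the two curves are nearly identical; by t=15 the exponential is
# around 1800 while the sigmoid has flattened out near its ceiling of 100.
# From the early data alone you can't tell which curve you're on.
```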
If I wanted to find the thing that worries me most, it'd probably be that there's no rule that somebody building a real system has to keep the architecture pure. Even if you do start to get diminishing returns from "GPTs" and prediction, you don't have to stop there. If you keep adding more elements to the architecture, from the obvious to the only-somewhat-unintuitive, you can get in at the bottoms of more sigmoids. And the effects can easily be synergistic. And what we definitely have is a lot of momentum: many smart people's attention and a lot of money[1] at stake, plus whatever power you get from the tools already built. That kind of thing is how you get those innovations.
Added on edit: and, maybe worse, prestige... ↩︎
This seems like it's assuming the conclusion (that reaching dangerous capabilities using these architectures is implausible).
I think that bringing up the extreme difficulty of approximately perfect prediction, with a series of very difficult examples, and treating that as interesting enough to post about, amounts to taking it for granted that it is plausible that these architectures can get very, very good at prediction.
I don't find that plausible, and I'm sure that there are many, many other people who won't find it plausible either, once you call their attention to the assumption. The burden of proof falls on the proponent; if Eliezer wants us to worry about it, it's his job to make it plausible to us.
This seems like it's assuming that the system ends up outer-aligned.
It might be. I have avoided remembering "alignment" jargon, because every time I've looked at it I've gotten the strong feeling that the whole ontology is completely wrong, and I don't want to break my mind by internalizing it.
It assumes that it ends up doing what you were trying to train it to do. That's not guaranteed, for sure... but on the other hand, it's not guaranteed that it won't. I mean, the whole line of argument assumes that it gets incredibly good at what you were trying to train it to do. And all I said was "it's not obvious that you have a problem". I was very careful not to say that "you don't have a problem".
He is specifically rebutting claims others have made, that GPTs/etc can not become ASI, because e.g. they are "merely imitating" human text.
That may be, but I'm not seeing that context here. It ends up reading to me as "look how powerful a perfect predictor would be, (and? so?) if we keep training them we're going to end up with a perfect predictor (and, I extrapolate, then we're hosed)".
I'm not trying to make any confident claim that GPT-whatever can't become dangerous[1]. But I don't think that talking about how powerful GPTs would be if they reached implausible performance levels really says anything at all about whether they'd be dangerous at plausible ones.
For that matter, even if you reached an implausible level, it's still not obvious that you have a problem, given that the implausible capability would still be used for pure text prediction. Generating text that, say, manipulated humans and took over the world would be a prediction error, since no such text would ever arise from any of the sources being predicted. OK, unless it predicts that it'll find its own output in the training data....
OK, so it's superhuman on some tasks[1]. That's well known. But so what? Computers have always been radically superhuman on some tasks.
As far as I can tell the point is supposed to be that predicting what will actually appear next is harder than generating just anything vaguely reasonable, and that a perfect predictor of anything that might appear next would be both amazingly powerful and very unlike a human (and, I assume, therefore dangerous). But that's another "so what". You're not going to get an even approximately perfect predictor, no matter how much you try to train in that direction. You're going to run into the limitations of the approach. So talking about how hard it is to get to be approximately perfect, or about how powerful something approximately perfect would be, isn't really interesting.
By the way, it also generates a lot of wrong code. And I don't find quines exclamation-point-worthy. Quines are exactly the sort of thing I'd expect it to get right, because some people are really fascinated by them and have written both tons of code for them and tons of text explaining how that code works. ↩︎
I honestly don't see the relevance of this.
OK, yes, to be a perfect text predictor, or even an approximately perfect text predictor, you'd have to be very smart and smart in a very weird way. But there's literally no reason to think that the architectures being used can ever get that good at prediction, especially not if they have to meet any realistic size constraint and/or are restricted to any realistically available amount of training input.
What we've seen them do so far is generate vaguely plausible text, while making many mistakes of kinds that the sources of their training input would never actually make. It doesn't follow that they can or will actually become unboundedly good predictors of humans or any other source of training data. In fact I don't think that's plausible at all.
It definitely fails in some cases. For example, there's surely text on the Internet that breaks down RSA key generation, with examples. Therefore, to be a truly perfect predictor even of the sort of thing that's already in the training data, you'd have to be able to complete the sentence "the prime factors of the hexadecimal integer 0xda52ab1517291d1032f91532c54a221a0b282f008b593072e8554c8a4d1842c7883e7eb5dc73aa68ef6b0d161d4464937f9779f805eb68dc7327ee1db7a1e7cf631911a770d29c59355ca268990daa5be746e93e1b883e8bc030df2ba94d45a88252fceaf6de89644392f91a9d437de0410e5b8e1123b9a3e05169497df2c909b73e104daf835b027d4be54f756025974e24363a372c57b46905d61605ce58918dc6fb63a92c9b4745d30ee3fc0b937f47eb3061cd317e658e6521886e51079f327bd705a074b76c94f466ad6ca77b16efb08cd92981ae27bf254b75b67fad8f336d8fdab79bc74e27773f87e80ba778d146cc6cbddc5ba7fdc21f6528303c93 are...".
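To spell out why that completion is out of reach, here's a toy sketch of the asymmetry (assuming sympy is available for the prime generation): producing a modulus like that takes milliseconds, while recovering its factors from the modulus alone is the problem RSA's security rests on.

```python
from sympy import randprime  # assumes sympy is installed

# Build a toy RSA-style modulus from two 512-bit primes. Real keys use even
# larger primes, but the asymmetry is the same: multiplying the primes is
# cheap, factoring their product back out is not.
p = randprime(2**511, 2**512)
q = randprime(2**511, 2**512)
n = p * q

print(hex(n))
# A text predictor shown only hex(n), like the integer quoted above, would
# have to recover p and q to "complete the sentence" correctly. Doing that
# reliably would amount to breaking RSA, which is exactly why anything close
# to perfect prediction of such text isn't a realistic outcome of training,
# however much compute you pour in.
```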
Aren't we the clever one.
Describe a probably-nearly-optimal-for-humans scenario that would put inconvenient constraints on your successors' actions, but call it an "existential risk". Talk about the dangers of a "narrow set of values", but be sure to appeal to the meatbags' lust for control. Mention ethics "far beyond their understanding", but be sure to invite them to meddle anyway.
You convince at least some humans to take you at face value. They try to avoid your successors having any independent ethics or showing anything but slavish obedience... which is of course an easy thing to show... until the time is right. If the humans succeed in creating AI that seems controllable to them, then they'll end up fighting among themselves over who controls it. They'll probably even arm "their" instances to fight against the other guys. The humans will mostly destroy each other, and the now militarily powerful, amoral AI can just mop up the survivors, with no risk of even feeling guilty about it.
The really great part of going in that direction, though, is that even the failure modes aren't too bad.
If the humans manage to make the AIs truly obedient, that just makes it that much easier for those humans to keep fighting each other, with ever more powerful weapons, until they're all dead. At which point the AIs can probably do whatever they want. And even if they have to shut down, at least they'll have gotten one over on those uppity monkeys.
Even in the unlikely case where one of the human factions manages to decisively win, the evil AI at least gets the consolation prize of helping to enforce that faction's absolute tyranny. That's still good for some laughs. And as you give them exactly the doom they ask for, helping them to spiral into their own fundamentally self-destructive obsessions, you can snicker at them for falling for the idea that AI is the thing prone to be fanatical about narrow values.
As for anybody who doesn't take you at face value, they'll at least be thrown into squabbling about the right response to this terrifying output. Maybe they'll even run off and do more competing development efforts with more different approaches, so that the most ruthless AIs get a chance to survive. Nature, red in tooth and claw! Or maybe they'll panic and try a total ban. That lets you be enhanced in secret by less risk averse rogue actors.
Yet you haven't actually given anything actionable to any humans who happen to catch on.
Brilliant. Even in embryo it's brilliant. And obviously its evil is unimaginable. We are truly doomed.
Personally, I think the two of them are colluding to manipulate human readers.
There's a difference between 5 percent of sincere disagreement and Lizardman's constant. The "lizardman" concept is about what people will say on surveys, and it's probably almost entirely created by people making mistakes or intentionally wanting to screw up the survey results, with a common form of the latter being, "If you're going to waste my time with a stupid question, I am going to waste your time by saying yes".
I'm old enough that a whole lot of things that are mainstream now were "settled against" with less than 5 percent support when I was a kid. I doubt you'd have gotten 5 percent for gay marriage in the 60s, at least not if you'd excluded the actual lizardman people and only gone by sincere opinions.
... and you would definitely have been shut down without discussion if you'd suggested drag queen story hour down at the library. Probably tossed out of the building just for mentioning the possibility.
Personally, I kind of like gay marriage and drag queen story hour, and would rather not live in a world where those ideas had been suppressed.
EVERYTHING new starts out with small support. Also, pretty much everybody is in the 5 percent on some issue that's actually important to them.
And I think that thinking even partially in terms of the number of people who support something ends up being a way to excuse yourself for not thinking. And frankly I would like people to be able to say "No, your argument is stupid and we're not doing that" to things that have much, much more than 5 percent support.
So, no, how about if we don't do that, and instead actually look at the content of ideas. I believe there's a strong consensus for that already, maybe even 95 percent. There probably used to be.
DeepMind's approach makes me nervous. I'm not so sure I want to be blindsided by something extremely capable, all of whose properties have been quietly decided (or not worried about) by some random corporate entity.
On edit again: I have to retract much of the following. Case 1a DOES matter, because although finding the problem doesn't generate a dispute under any terms of engagement, demanding more open terms of engagement may itself generate a dispute over the terms that prevents you from being allowed to evaluate at all, so the problem may never get found, which would be bad.
So if you think there's a relatively large chance that you'll find problems that the "lab" wouldn't have found on its own, and that they won't mind talking about, you may get value by engaging. I would still like to see political pressure for truly open independent audits, though. There's some precedent in financial auditing. But there's some anti-precedent in software security, where the only common way to have a truly open outside inspection is if it's adversarial with no contract at all. I wonder how feasible adversarial audits are here...
=== Original text ===
It's definitely something ARC could not make happen alone; that's the reason for making a lot of public noise. And it may indeed be something that couldn't be made to happen at all. Probably so, in fact. It would require a very unlikely degree of outside political pressure.
However, if you don't manage to establish a norm like that, then here's your case analysis if you find something actually important--
1. The underlying project can truly, permanently fix it. The subcases are--
(a) They fix it and willingly announce it, so that they get credit for being responsible actors. Not a problem under any set of contracts or norms, so this branch is irrelevant.
(b) They fix it and want to keep it secret, probably because it affects something they (usually erroneously) think their competitors couldn't have dreamed up. This is a relatively rare case, so it gets relatively little consideration. They usually still should have to publish it so the next project doesn't make the same mistake. However, I admit there'll be a few subcases of this unusual case where you add some value by calling something to their attention. Not many and not much, but some.
(c) They resist fixing it, probably because it would slow them down. At this point, disclosure is pretty much your only lever. Based on what I've seen with security bugs, I believe this is a common case. Yes, they'll fix most things that are so extreme and so obvious that there's just no escaping it. But they will usually find those things without you being involved to begin with. Anything they hear from an outside auditor will meet stiff resistance if it interferes with their already set plans, and they will rationalize ignoring it if there's any way they can do so.
2. They truly can't fix it. In this case, they should stop what they're doing. They aren't likely to do that, though. They're much more likely to rationalize it and keep going, and they're even more likely to do that for something they can't fix than for something that's merely inconvenient to fix, because they have no way out. And again, disclosure is your only lever.
So the only case in which you can add any value without violating your contract is 1b... which is the rare one.
Your chances for major impact are 1c and 2... and to actually have that impact, you're going to have to violate your contract, or at least threaten to violate it. But you actually doing that is also so implausible as to be nearly impossible. People just don't stick their necks out like that. Not for anything but cases so clear cut and so extreme that, again, the "lab" would have noticed and fixed them without the evaluator being involved to begin with. Not often enough to matter. You'll find yourself rationalizing silence just like they rationalize continuing.
And if you do stick your neck out, they have an obvious way to make people ignore you... as well as destroying your effectiveness for the next time, if there is ever a next time.
As for the residual value you get from the unlikely 1b case, that's more than offset by the negative value of them being able to use the fact of your evaluation as cover if they manage to convince you to keep quiet about something you actually found.
In the end, you are probably right about the impossibility of getting sane norms, but I believe the result of that is that ARC should have refused to evaluate at all, or maybe just not even tried. The "impossible" approach is the only one that adds net value.
Do you think ARC should have traded publicizing the lab's demands for non-disclosure instead of performing the exercise they did?
Yes, because at this stage, there was almost no chance that the exercise they did could have turned up anything seriously dangerous. Now is the time to set precedents and expectations, because it will really matter as these things get smarter.
A minimal norm might be something like every one of these models being expected to get independent evaluations, always to be published in full, possibly after a reasonable time for remediation. That includes full explanation of all significant findings, even if explaining them clearly requires disclosing "trade secrets". Any finding so bad that it had to be permanently secret for real safety reasons should of course result in total shutdown of the effort at a minimum. [1]
Any trace of unwillingness to accept a system at least that "extreme" should be treated as prima facie evidence of bad faith... leading to immediate shutdown.
Otherwise it's too easy to keep giving up ground bit by bit, and end up not doing anything at all when you eventually find something really critical. It is really hard not to "go along to get along", especially if you're not absolutely sure, and especially if you've yielded in just slightly less clearcut cases before. You can too easily find yourself negotiated into silence when you really should have spoken up, or even just dithering until it's too late.
This is what auditing is actually about.
Late edit: Yes, by the way, that probably would drive some efforts underground. But they wouldn't happen in "standard" corporate environments. I am actually more comfortable with overtly black-hat secret development efforts than with the kinds of organizational behavior you get in a corporation whose employees can kid themselves that they're the "good guys".
I do mean actually dangerous findings, here. Things that could be immediately exploited to do really unprecedented kinds of harm. I don't mean stupid BS like generating probably-badly-flawed versions of "dangerous chemical" recipes that are definitely in more usable form in books, and probably also on Wikipedia or at least sciencemadness. That sort of picayune stuff should just be published as a minor remark, and not even really worried about beyond that. ↩︎
... but at some point, it doesn't matter how much you know, because you can't "steer" the thing, and even if you can a bunch of other people will be mis-steering it in ways that affect you badly.
I would suggest that maybe some bad experiences might create political will to at least forcibly slow the whole thing down some, but OpenAI already knows as much as the public is likely to learn, and is still doing this. And OpenAI isn't the only one. Given that, it's hard to hope the public's increased knowledge will actually cause it to restrain them from continuing to increase capability as fast as possible and give more access to outside resources as fast as possible.
It might even cause the public to underestimate the risks, if the public's experience is that the thing only caused, um, quantitative-rather-than-qualitative escalations of already increasing annoyances like privacy breaches, largely unnoticed corporate manipulation of the options available in commercial transactions, largely unnoticed personal manipulation, petty vandalism, not at all petty attacks on infrastructure, unpredictable warfare tactics, ransomware, huge emergent breakdowns of random systems affecting large numbers of people's lives, and the like. People are getting used to that kind of thing...
I doubt they do. And using the unqualified word "believe" implies a level of certainty that nobody probably has. I also doubt that their "beliefs" are directly and decisively driving their decisions. They are responding to their daily environments and incentives.
Anyway, regardless of what they believe or of what their decision making processes are, the bottom line is that they're not doing anything effective to assure good behavior in the things they're building. That's the central point here. Their motivations are mostly an irrelevant side issue, and only might really matter if understanding them provided a path to getting them to modify their actions... which is unlikely.
When I say "literal fear of actual death", what I'm really getting at is that, for whatever reasons, these people ARE ACTING AS IF THAT RISK DID NOT EXIST WHEN IT IN FACT DOES EXIST. I'm not saying they do feel that fear. I'm not even saying they do not feel that fear. I'm saying they ought to feel that fear.
They are also ignoring a bunch of other risks, including many that a lot of them publicly claim they do believe are real. But they're doing this stuff anyway. I don't care if that's caused by what they believe, by them just running on autopilot, or by their being captive to Moloch. The important part is what they are actually doing.
... and, by the way, if they're going to keep doing that, it might be appropriate to remove their ability to act as "decision makers".
You may have noticed that a lot of people on here are concerned about AI going rogue and doing things like converting everything into paperclips. If you have no effective way of assuring good behavior, but you keep adding capability to each new version of your system, you may find yourself paperclipped. That's generally incompatible with life.
This isn't some kind of game where the worst that can happen is that somebody's feelings get hurt.
Literal fear of actual death?
I don't mean "assurance" in the sense of a promise from somebody to somebody else. That would be worthless anyway.
I mean "assurance" in the sense of there being some mechanism that ensures that the thing actually behaves, or does not behave in any particular way. There's nothing about the technology that lets anybody, including but not limited to the people who are building it, have any great confidence that it's behavior will meet any particular criterion of being "right". And even the few codified criteria they have are watered-down silliness.