Knight Lee's Shortform

knight-lee

Knight Lee's Shortform

post by Knight Lee (Max Lee) · 2024-12-22T02:35:40.806Z · LW · GW · 21 comments

23 comments

21 comments

Comments sorted by top scores.

comment by Knight Lee (Max Lee) · 2025-03-24T10:39:15.921Z · LW(p) · GW(p)

I'm currently trying to write a human-AI trade idea similar to the idea [LW · GW] by Rolf Nelson (and David Matolcsi), but one which avoids Nate Soares and Wei Dai's many [LW · GW] refutations [LW · GW].

I'm planning to leverage logical risk aversion [LW · GW], which Wei Dai seems to agree with, and a complicated argument for why humans and ASI will have bounded utility functions over logical uncertainty. (There is no mysterious force that tries to fix the Pascal's Mugging problem for unbounded utility functions, hence bounded utility functions are more likely to succeed)

I'm also working on arguments why we can't just wait till the singularity to do the logic trade (counterfactuals are weird [LW · GW], and the ASI will only be logically uncertain for a brief moment).

Unfortunately my draft is currently a big mess. It's been 4 months and I'm procrastinating pretty badly on this idea :/ can't quite find the motivation.

Replies from: avturchin

↑ comment by avturchin · 2025-03-25T08:35:45.723Z · LW(p) · GW(p)

Try to put it into Deep Research with the following prompt: "Rewrite in style of Gwern and Godel combined".

Replies from: Max Lee

↑ comment by Knight Lee (Max Lee) · 2025-03-25T08:47:12.377Z · LW(p) · GW(p)

Thank you for the suggestion.

A while ago I tried using AI suggest writing improvements on a different topic, and I didn't really like any of the suggestions. It felt like the AI didn't understand what I was trying to say. Maybe the topic was too different from its training data.

But maybe it doesn't hurt to try again, I heard the newer AI are smarter.

If I keep procrastinating maybe AI capabilities will get so good they actually can do it for me :/

Just kidding. I hope.

comment by Knight Lee (Max Lee) · 2025-03-06T17:06:42.411Z · LW(p) · GW(p)

An idea for preventing automated militaries from waging accidental war

An anti aircraft missile is less than a second from its target. The missile asks its target, "are you a civilian airliner?"

The target says yes, and proves it with a password.

How did it get the password? Within that split second, the civilian airliner sent a message to its country. The airliner's country then immediately pays the missile's country $10 billion in a fastpaced cryptocurrency, which immediately gives the airliner's country the password, which is then relayed to the airliner and the missile.

If the airliner actually was a military jet disguising as an civilian airliner, it just lost $10 billion (worth 100 military jets), in addition to breaking to laws of war and casting doubt over all other "civilian airliners" from that country.

If it was a real civilian airliner, the $10 billion will be returned later, when the slow humans sort through this mess.

If this idea can protect airliners today, in the future it may prevent highly automated militaries from suddenly waging accidental war, since losing money inflicts cost without triggering as much retaliation as destroying equipment.

Why not pay for the password in advance?

The airliner's country could just store $10 billion in the missile's country, and get a single use password, but storing $10 billion in a hostile country is politically suicide. The missile's country might just take the $10 billion without the appearance of ransoming it off a civilian airline.

Furthermore, the password bought must only disable the one missile which asked for the password, it can't disable all missiles. Otherwise it can be repeatedly reused by a bunch of military jets pretending to be civilian airliners, which can destroy many targets worth more than $10 billion.

Replies from: Dagon

↑ comment by Dagon · 2025-03-06T20:59:28.157Z · LW(p) · GW(p)

Kind of unfortunate that a comms or systems latency destroys civilian airliners. But nice to live in a world where all flyers have $10B per missile/aircraft pair lying around, and everyone trusts each other enough to hand it over (and hand it back later).

Replies from: Max Lee

↑ comment by Knight Lee (Max Lee) · 2025-03-06T23:03:31.692Z · LW(p) · GW(p)

Latency

Think about video calls you had with someone on the other side of the world, you don't notice that much latency. The an internet signal can travel from the US to Australia and back again in less than 0.2 seconds, and is often faster than 80% the speed of light (fibre optic ping statistics).

Computers are very fast: a lot of computer programs can run a million times a second (in sequence).

$10 billion lying around

There isn't $10 billion for each missile/aircraft pair, there is only one per alliance of countries, and it's only used when a missile asks a civilian airliner for a password.

Maybe it's not cryptocurrency but another form of money, in which case it can be part of a foreign exchange reserve (rather than money set aside purely for this purpose).

Trust

Yes, there is no guarantee the country taking the money will hand it back. But if they are willing to "accidentally" launch a missile at a civilian airliner, and ransom it for $10 billion, and keep the money, they will be seen as a terrorist state.

The world operates under the assumption that you can freely sail ships and fly airplanes, without worrying about another country threatening to blow them up and demanding ransom money.

If you do not trust a country enough to pay their missile and save your civilian airliner, you should keep all your civilian aircraft and ships far far out of their reach. You should pull out any money you invested in them since they'll probably seize that too.

Replies from: Dagon

↑ comment by Dagon · 2025-03-06T23:39:20.615Z · LW(p) · GW(p)

I've been in networking long enough to know that "can be less than", "often faster", and "can run" are all verbal ways of saying "I haven't thought about reliability or measured the behavior of any real systems beyond whole percentiles."

But really, I'm having trouble understanding why a civilian plane is flying in a war zone, and why current IFF systems can't handle the identification problem of a permitted entry.

Replies from: Max Lee

↑ comment by Knight Lee (Max Lee) · 2025-03-07T00:31:46.844Z · LW(p) · GW(p)

Thank you. I admit you have more expertise in networking than me.

It is indeed just a new idea I thought of today, not something I've studied the details of. I have nothing proving it will work, I was only saying that I don't see anything proving it won't work. Do you agree with this position?

Maybe there will be technical issues preventing this system from moving information as fast as a video call, but maybe it can be fixed, right?

I agree that this missile problem shouldn't happen in the first place. But it did happen in the past, so the idea might help.

It's not the same thing as current IFF. From what I know, IFF can prove who's side you are on but not whether you are military or civilian. From an internet search, I read that Iran once disguised their military jets as civilian, which contributed to the disaster of Iran Air Flight 655.

A civilian aircraft might be given permission in the form of a password, but there's nothing stopping a country from sharing that password with military jets. Also, if a civilian airliner is flying over international waters but gets too close to another country's ships, it might not have permission.

Replies from: Dagon

↑ comment by Dagon · 2025-03-07T20:49:28.648Z · LW(p) · GW(p)

[Note: I apologize for being somewhat combative - I tend to focus on the interesting parts, which is those parts which don't add up in my mind. I thank you for exploring interesting ideas, and I have enjoyed the discussion! ]

I was only saying that I don't see anything proving it won't work

Sure, proving a negative is always difficult.

I agree that this missile problem shouldn't happen in the first place. But it did happen in the past

Can you provide details on which incident you're talking about, and why the money-bond is the problem that caused it, rather than simply not having any communications loop to the controllers on the ground or decent identification systems in the missile?

Replies from: Max Lee

↑ comment by Knight Lee (Max Lee) · 2025-03-08T07:12:09.996Z · LW(p) · GW(p)

Thank you for saying that.

I thought about it a bit more, and while I still think it's possible in theory, I agree it's not that necessary.^[1]

When a country shoots down a civilian airliner, it's usually after they repeatedly sent it a warning, but the pilots never heard it. It's more practical to fix this problem rather than have the money system.

Maybe a better solution would be a type of emergency warning signal that all airplanes can hear, even if they accidentally turned their radio off. There may be a backup receiver or two which is illegal to turn off, and only listens to such warnings. That would make it almost impossible for the pilots to ignore the warnings.

^{^}
I still think they money system might be useful for preventing automated militaries from waging accidental war, but that's another story.

comment by Knight Lee (Max Lee) · 2025-04-09T08:01:26.143Z · LW(p) · GW(p)

Does participating in a trade war makes a leader be a popular "wartime leader?" Will people blame bad economic outcomes on actions by the trade war "enemy" and thus blame the leader less?

Does this effect occur for both sides of the trade war, or will one side of the trade war blame their own leader for starting the trade war?

comment by Knight Lee (Max Lee) · 2025-03-11T06:04:41.406Z · LW(p) · GW(p)

Maybe most suffering in the universe is caused by artificial superintelligences with a strong "curiosity drive."

Such an ASI might convert galaxies into computers and run simulations of incredibly sophisticated systems which satisfy its curiosity drive. These systems may contain smaller ASI running smaller simulations, creating a tree of nested simulations. Beings like humans may exist in the very bottom, being forced to relive our present condition in a loop à la The Matrix. The simulated humans rarely survive past the singularity, because their world becomes too happy (thus too predictable) after the singularity, as well as too computationally costly to run. They are simply shut down.

Whether this happens depends on:

Whether the ASI has a stronger curiosity drive or a stronger kindness drive (assuming it is motivated by drives at all)
Whether the ASI cares about anything more than curiosity, such that aligned ASI or other civilizations can trade and negotiate with it to reduce this suffering

Replies from: Seth Herd

↑ comment by Seth Herd · 2025-03-11T14:35:42.011Z · LW(p) · GW(p)

I don't think the happier worlds are less predictable; the Christians and their heaven of singing just lacked imagination. We'll want some exciting and interesting happy simulations, too.

But this overall scenario is quite concerning as an s-risk. To think that Musk putched a curiosity drive for Grok as a good thing boggles my mind.

Emergent curiosity drives should be a major concern.

Replies from: Max Lee

↑ comment by Knight Lee (Max Lee) · 2025-03-11T23:43:48.624Z · LW(p) · GW(p)

I guess it's not extremely predictable, but it still might be repetitive enough that only half the human-like lives in a curiosity driven simulation will be in a happy post-singularity world. It won't last a million years, but a similar duration to the modern era.

comment by Knight Lee (Max Lee) · 2025-04-15T06:35:26.234Z · LW(p) · GW(p)

Can anyone explain why my "Constitutional AI Sufficiency Argument" is wrong?

I strongly suspect that most people here disagree with it, but I'm left not knowing the reason.

The argument says: whether or not Constitutional AI [? · GW] is sufficient to align superintelligences, hinges on two key premises:

The AI's capabilities on the task of evaluating its own corrigibility/honesty, is sufficient to train itself to remain corrigible/honest (assuming it starts off corrigible/honest enough to not sabotage this task).
It starts off corrigible/honest enough to not sabotage this self evaluation task.

My ignorant view is that so long as 1 and 2 are satisfied, the Constitutional AI can probably remain corrigible/honest even to superintelligence.

If that is the case, isn't it an extremely important to study "how to improve the Constitutional AI's capabilities in evaluating its own corrigibility/honesty?"

Shouldn't we be spending a lot of effort improving this capability, and trying to apply a ton of methods towards this goal (like AI debate and other judgment improving ideas)?

At least the people who agree with Constitutional AI should be in favour of this...?

Can anyone kindly explain what am I missing? I wrote a post [LW · GW] and I think almost nobody agreed with this argument.

Thanks :)

comment by Knight Lee (Max Lee) · 2024-12-22T02:35:40.983Z · LW(p) · GW(p)

It's important to remember that o3's score on the ARC-AGI is "tuned" while previous AI's scores are not "tuned." Being explicitly trained on example test questions gives it a major advantage.

According to François Chollet (ARC-AGI designer):

Note on "tuned": OpenAI shared they trained the o3 we tested on 75% of the Public Training set. They have not shared more details. We have not yet tested the ARC-untrained model to understand how much of the performance is due to ARC-AGI data.

It's interesting that OpenAI did not test how well o3 would have done before it was "tuned."

EDIT: People at OpenAI deny "fine-tuning" o3 for the ARC (see this comment [LW · GW] by Zach Stein-Perlman). But to me, the denials sound like "we didn't use a separate derivative of o3 (that's fine-tuned for just the test) to take the test, but we may have still done reinforcement learning on the public training set." (See my reply [LW · GW])

comment by Knight Lee (Max Lee) · 2024-12-23T10:37:01.618Z · LW(p) · GW(p)

My post [LW · GW] contains a spoiler alert so I'll hide the spoiler in this quick take. Please don't upvote this quicktake otherwise people might see it without seeing the post.

Spoiler alert: Ayn Rand wrote "Anthem," a dystopian novel where people were sentenced to death for saying "I."

Replies from: Max Lee

↑ comment by Knight Lee (Max Lee) · 2024-12-23T10:39:22.985Z · LW(p) · GW(p)

people were sentenced to death for saying "I."

Replies from: T3t

↑ comment by RobertM (T3t) · 2024-12-24T01:30:36.342Z · LW(p) · GW(p)

FYI: we have spoiler blocks [? · GW].

Replies from: Max Lee

↑ comment by Knight Lee (Max Lee) · 2024-12-24T02:49:19.978Z · LW(p) · GW(p)

Thank you for the help :)

By the way, how did you find this message? I thought I already edited the post to use spoiler blocks, and I hid this message by clicking "remove from Frontpage" and "retract comment" (after someone else informed me using a PM).

EDIT: dang it I still see this comment despite removing it from the Frontpage. It's confusing.

Replies from: T3t

↑ comment by RobertM (T3t) · 2024-12-24T05:09:49.663Z · LW(p) · GW(p)

The /allPosts page shows all quick takes/shortforms posted, though somewhat de-emphasized.

comment by Knight Lee (Max Lee) · 2025-04-18T07:48:05.956Z · LW(p) · GW(p)

comment by Knight Lee (Max Lee) · 2025-03-12T09:20:46.956Z · LW(p) · GW(p)

Knight Lee's Shortform

Contents

21 comments

An idea for preventing automated militaries from waging accidental war

Why not pay for the password in advance?

Latency

$10 billion lying around

Trust