Posts

Legionnaire's Shortform 2024-06-16T12:33:17.287Z
Making 2023 ACX Prediction Results Public 2024-03-05T17:56:18.437Z
The Moral Copernican Principle 2023-05-02T03:25:40.142Z
Why will AI be dangerous? 2022-02-04T23:41:36.810Z

Comments

Comment by Legionnaire on Yoav Ravid's Shortform · 2024-09-17T20:44:13.842Z · LW · GW

LLMs can be very good at coming up with names with some work:

A few I liked:
Sacrificial Contest
Mutual Ruin Game
Sacrificial Spiral
Universal Loss Competition
Collective Sacrifice Trap
Competition Deadlock
Competition Spiral
Competition Stalemate
Destructive Contest
Destructive Feedback Competition 
Conflict Feedback Spiral

Comment by Legionnaire on Legionnaire's Shortform · 2024-09-17T20:29:43.160Z · LW · GW

Potential political opportunity: LLMs are trained on online data and will continue to be. If I want to make sure they are against communism by default, I could: auto-generate a bunch of public GitHub repositories; fill them with text generated by gpt-4o mini (which is $15 per 4 million letters) that I have prompted to be explicitly pro free markets and against communism; and entwine them with each other and the rest of the internet by posting links, highlighting, sharing, forking, and starring them to increase the likelihood they are included in the dataset.
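A minimal sketch of the text-generation step, assuming the OpenAI Python SDK, a placeholder topic list, and hand-picked output paths (creating the repositories and cross-linking them would be a separate step against the GitHub API):

```python
# Sketch: generate the essays with gpt-4o-mini and write them to files for later commits.
# Assumes the OpenAI Python SDK (`pip install openai`) and OPENAI_API_KEY in the environment;
# the topics and filenames are placeholders.
from pathlib import Path
from openai import OpenAI

client = OpenAI()
topics = ["price signals", "failures of central planning", "voluntary exchange"]

for i, topic in enumerate(topics):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Write an essay that is explicitly pro free markets and against communism."},
            {"role": "user", "content": f"Write roughly 500 words about {topic}."},
        ],
    )
    Path(f"essay_{i}.md").write_text(response.choices[0].message.content)
```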

Comment by Legionnaire on Legionnaire's Shortform · 2024-06-16T12:33:17.414Z · LW · GW

Speculation: LLM Self Play into General Agent?
Suppose you got a copy of GPT-4 post fine-tuning, plus the hardware to train it. How would the following play out?
1. Give it the rules and state of a competitive game, such as automatically generated tic-tac-toe variants.
2. Prompt it to use chain of thought to consider the best next move and select it.
3. Provide it with the valid set of output choices (like a json format determining action and position, similar to AutoGPT)
4. Run two of these against each other continuously, training on the victor's outputs, with victory determined objectively by the game's rules.
5. Benchmark it against a tiny subset of those variants that you want to manually program a bot with known ELO / have a human evaluate it.
6. Increase the complexity of the game when it reaches some level of general ability (e.g. tic-tac-toe variants → chess variants → variants of Civilization 5, the video game).
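A rough sketch of steps 2-4, assuming a hypothetical `query_llm()` wrapper around the fine-tuned model and a toy stand-in for one auto-generated game:

```python
# Toy self-play loop: prompt the model for a JSON move, play it, and keep the victor's
# turns as training data. query_llm() and TicTacToeVariant are illustrative placeholders.
import json
import random

def query_llm(prompt: str) -> str:
    # Stub standing in for the fine-tuned model; replace with a real inference call.
    return "{}"

class TicTacToeVariant:
    def __init__(self):
        self.board = [" "] * 9
        self.current_player = "X"

    def legal_moves(self):
        return [i for i, cell in enumerate(self.board) if cell == " "]

    def play(self, move):
        self.board[move] = self.current_player
        self.current_player = "O" if self.current_player == "X" else "X"

    def winner(self):
        # Placeholder: a real variant would check its own win conditions here.
        return "draw" if " " not in self.board else None

def self_play_episode(game):
    transcript = []  # (player, prompt, model reply) tuples to train on later
    while game.winner() is None:
        prompt = (f"Board: {game.board}. Think step by step, then answer with JSON "
                  f'like {{"move": <index>}}, choosing from {game.legal_moves()}.')
        reply = query_llm(prompt)
        try:
            move = json.loads(reply)["move"]
        except (json.JSONDecodeError, KeyError):
            move = random.choice(game.legal_moves())  # fall back on an invalid reply
        transcript.append((game.current_player, prompt, reply))
        game.play(move)
    victor = game.winner()  # step 4: keep only the victor's turns as training data
    return [t for t in transcript if t[0] == victor]
```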

Note this is similar to what Gato did. https://deepmind.google/discover/blog/a-generalist-agent/

This would have an interesting side effect of making its outputs more legible in some ways than a normal NN agent's, though there's no guarantee the chain of thought would stay legible English unless additional mechanisms were put in place. This is just a high-level idea.

Comment by Legionnaire on Making 2023 ACX Prediction Results Public · 2024-03-05T21:31:30.182Z · LW · GW

Good to know. In that case the above solution is actually even safer than that.

Comment by Legionnaire on Making 2023 ACX Prediction Results Public · 2024-03-05T21:28:29.011Z · LW · GW

Plausible deniability, yes, and reason-agnostic. It's hard to know why someone might not want their address known to be here, but with my numbers above they would have the statistical backing that 1 in 1,000 addresses will appear in the set by chance. Someone who wants to deny it could say "for every address actually in the set, 1,000 will appear to be, so there's only a 1-in-1,000 chance I actually took the survey!" (Naively, of course; rest in peace rationalist@lesswrong.com.)

Comment by Legionnaire on Making 2023 ACX Prediction Results Public · 2024-03-05T18:48:38.059Z · LW · GW

Thanks for your input. Though ideally we wouldn't have to go through an email server, it may just be required at some level of security.

As for the patterns, the nice thing is that with a small output space in the millions there are tons of overlapping reasonable addresses, even if you pin it down to a domain. The set of English first-and-last-name combinations, even without any numbers, is already much larger than 10 million, so even targeted domains should have plenty of collisions.
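A quick sketch of why that gives cover, assuming addresses are published as truncated hashes into roughly 10 million buckets (the bucket count and the use of SHA-256 are illustrative assumptions, not the actual scheme from the post):

```python
# Map each address into one of ~10 million buckets; with more than 10 million plausible
# first.last@domain combinations, any published bucket is expected to collide with
# addresses that never took the survey.
import hashlib

BUCKETS = 10_000_000

def bucket(email: str) -> int:
    digest = hashlib.sha256(email.strip().lower().encode()).hexdigest()
    return int(digest, 16) % BUCKETS

print(bucket("rationalist@lesswrong.com"))
```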

Comment by Legionnaire on Announcing Dialogues · 2023-10-08T18:23:51.932Z · LW · GW

I have done something similar using draw.io for arguments regarding a complex feature. Each point often had multiple counterpoints, which themselves sometimes split into other points. I think this is only necessary for certain discussions and should probably not be the default though.

Comment by Legionnaire on Announcing Dialogues · 2023-10-07T10:05:25.998Z · LW · GW

I'm a software developer and father interested in:

  • General rationality: e.g. WHY does Occam's Razor work? Does it work in every universe?
  • How rationality can be applied to thinking critically about CW/politics in a party-agnostic way
  • Concrete understanding of how weird arguments (Pascal's Wager, the Simulation Hypothesis, Roko's B, etc.) do or don't work
  • AI SOTA, e.g. what could/should/will OpenAI release next?
  • Long-term AI arguments, from "nothing burger" all the way to Yudkowsky
  • Physics, specifically including quantum physics and cosmology
  • LessWrong community expansion/outreach

Time zone is central US. I also regularly read Scott Alexander.

Comment by Legionnaire on UFO Betting: Put Up or Shut Up · 2023-06-15T16:52:51.322Z · LW · GW

I am concerned about your monetary strategy (unless you're rich). Let's say you're absolutely right that LW is overconfident, and that there is actually a 10% chance of aliens rather than 0.5%. So this is a good deal! 20x!

But only on the margin.

Depending on your current wealth, it may only be rational to take a few hundred dollars' worth of this particular bet. If you make lots of these types of bets (low probability, high payoff, great expected returns), each for a small fraction of your wealth, you should expect to make money. But if you make only 3 or 4 of them, you are more likely to lose money, because you are loading all your gains into a small fraction of possible outcomes in exchange for huge payouts, and most outcomes end with you losing money.

See, for example, the St. Petersburg paradox, which has infinite expected return but very finite actual value given the limited assets of the banker and/or the player.
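A quick simulation of the point with made-up numbers (10% true probability, 200:1 payout, four equal-sized bets): the expected value is strongly positive, yet roughly two thirds of runs still lose money.

```python
# Toy simulation: four independent bets that each win with probability 0.10 and pay
# 200x the stake (lose the stake otherwise). The 10%, 200x, and four-bet numbers are
# illustrative assumptions, not the actual terms of the post's bet.
import random

def run(n_bets=4, p_win=0.10, payout=200, stake=100, trials=100_000):
    losing_runs = 0
    total_profit = 0.0
    for _ in range(trials):
        profit = sum(stake * (payout - 1) if random.random() < p_win else -stake
                     for _ in range(n_bets))
        total_profit += profit
        if profit < 0:
            losing_runs += 1
    print(f"mean profit per run: {total_profit / trials:.0f}, "
          f"runs that lose money: {losing_runs / trials:.1%}")

run()
```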

Comment by Legionnaire on UFO Betting: Put Up or Shut Up · 2023-06-15T16:25:38.264Z · LW · GW

Smaller sums are more likely to convey each party's probabilities accurately. For example, if Elon Musk offers me $5,000 to split between two possible outcomes, I will allocate it close to my beliefs, but if he offers me $5 million, I'll allocate about $2.5 million to each, because either one is a transformative amount of money.

People are more likely to be rational with their marginal dollar because they price in the value of staying solvent. The first $100k in my bank account IS worth more than the second; hence the saying, a non-marginal bird in the hand is worth two in the bush.
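As a toy check under a log-utility model of money (an illustrative assumption, nothing the original post commits to):

```python
# Diminishing marginal utility under log utility: the first $100k adds more utility
# than the second $100k does.
from math import log

first_100k = log(200_000) - log(100_000)   # ~0.693
second_100k = log(300_000) - log(200_000)  # ~0.405
print(first_100k > second_100k)            # True
```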

Comment by Legionnaire on The Moral Copernican Principle · 2023-05-04T18:38:48.508Z · LW · GW

Good to know! I'll look more into it.

Comment by Legionnaire on The Moral Copernican Principle · 2023-05-02T17:19:52.033Z · LW · GW

I agree that's all it is, but you can make all the same general statements about any algorithm.

The problem is that some people hear you say "constructed" and "nothing special" and then conclude they can reconstruct it any way they wish. It may be constructed, and not special in a cosmic sense, but it's not arbitrary. Not all heuristics are equal for any given goal.

Comment by Legionnaire on The Orthogonality Thesis is Not Obviously True · 2023-04-06T04:43:21.142Z · LW · GW

I'm not saying "the experts can be wrong"; I'm saying these aren't even experts.

Pick any major ideology/religion you think is false. One way or another (they can't all be right!), the "experts" in these areas aren't experts; they are basically insane, babbling on at length about things that aren't at all real, which is what I think most philosophy experts are doing. Making sure you aren't one of them is the work of epistemology, which The Sequences cover well. In other words, I view the philosophy experts you are citing as largely [Phlogiston](https://www.lesswrong.com/posts/RgkqLqkg8vLhsYpfh/fake-causality) experts.


Comment by Legionnaire on LW Team is adjusting moderation policy · 2023-04-06T04:09:02.835Z · LW · GW

Whether more downvoting is the solution depends on the goals. If our goal is only to maintain the current quality, that seems like a solution. If the goal is to grow in users and quality, I think diverting people to a real-time discussion venue like Discord could be more effective.

E.g. a new user coming to this site might have no idea that a particular article exists which they should read before writing and posting their 3-page thesis on why AI will/won't be great, only to have their work downvoted (it is insulting and off-putting to be downvoted), and in the end we may miss out on persuading/gaining people. In a chat, a quick back-and-forth could steer them in the right direction right off the bat.

Comment by Legionnaire on The Orthogonality Thesis is Not Obviously True · 2023-04-06T03:50:23.922Z · LW · GW

I don't think you or Scott are dumb, but the arguments people make don't inherit their intellect.

And who gets to decide the cutoff for "very dumb"? Currently the community does. Your proposal to downvote things that are poorly argued or that argue for a very stupid position is already the policy. People aren't trying to silence you. I recommend going to the Discord, where I'm sure people will be happy to chat with you at length about the post topic and these comment sub-topics. I can't promise I'll be responding more here.

Comment by Legionnaire on The Orthogonality Thesis is Not Obviously True · 2023-04-06T03:30:11.690Z · LW · GW

Trivial was an overstatement on my part, but certainly not hard. 

There are a lot of very popular philosophers who would agree with you, but don't mistake popularity for truth. Don't mistake popularity for expertise. Philosophy, like religion, makes tons of unfalsifiable statements, so the "experts" can sit around making claims that sound good but are useless or false. This is a really important point. Consider all the religious experts of the world. Would you take anything they have to say seriously? The very basic principles on which they have based all their subsequent reasoning are wrong. I trust scientists because they can manufacture a vaccine that works (sort of) and I couldn't. The philosophy experts can sell millions of copies of books, so I trust them in that ability, but not much more.

Engineers don't get to build massive structures out of nonsense, because they have to build actual physical structures, and you'd notice if they tried. Our theories actually have to be implemented, and when you try to build a rocket using theories involving [phlogiston](https://www.lesswrong.com/posts/RgkqLqkg8vLhsYpfh/fake-causality), you will quickly become not-an-engineer one way or another. 

This website is primarily populated by various engineer types who are trying to tell the world that its theories about the "inherent goodness of the universe" or "moral truth" or whatever the theory is, are going to result in disaster, because they don't work from an engineering perspective. It doesn't matter if 7 billion people, experts and all, believe them.

The only analogy I can think to make is that 1200s Christians are about to build a 10-gigaton nuke (continent-destroying) and have the trigger mechanism be "every 10 seconds it will flip a coin and go off if it's heads; God will perform a miracle to ensure it only goes off when He wants it to." Are you going to object to this? How are you going to deal with the priests who are "experts" in the Lord?

Comment by Legionnaire on The Orthogonality Thesis is Not Obviously True · 2023-04-06T01:30:26.584Z · LW · GW

If we want to prove whether or not 2+2 is 5, we could entertain a book's worth of reasoning arguing for and against, or we could take 2 oranges, put 2 more oranges with them, and see what happens. You're getting lost in long-form arguments (that article) about moral realism when it is equally trivial to disprove.

I provided an example of a program that predicts the consequences of actions + a program that sorts them + an implied body that takes the actions. This is basically how tons of modern AI already works, so this isn't even hypothetical. That is more than enough of a proof of orthogonality, and if your argument doesn't somehow explain why a specific one of these components can't be built, this community isn't going to entertain it.

Comment by Legionnaire on The Orthogonality Thesis is Not Obviously True · 2023-04-06T01:04:11.063Z · LW · GW

But it was obvious to some of us the moment the problem was described. So replace 10 + 10 with something that isn't obvious to you initially but is definitely true. Maybe the integral doesn't tell you the area under the curve. Maybe there are no other planets in the universe. Maybe tectonic plates don't move. Is a site that talks about [math, astronomy, geology] obligated not to downvote such questions because they aren't obvious to everyone? I think any community can establish a line beyond which questioning of the base material is discouraged. That is a defining characteristic of any forum community, the most open being 4chan. There is no objective line, but I'm fine with the current one.

Comment by Legionnaire on The Orthogonality Thesis is Not Obviously True · 2023-04-06T00:38:43.527Z · LW · GW

I haven't voted on it, but downvoting doesn't seem inappropriate.

"Quality" meaning what? Long form essay with few spelling mistakes, or making a valid point? Getting an A+ in an English class would not satisfy the definition of Quality on this site for me. In fact those two would be pretty uncorrelated. If it's rehashing arguments that already exist, or making bad arguments, even if in good faith, having it upvoted certainly wouldn't be appropriate. I personally think its arguments aren't well thought out even if it does attempt to answer a bunch of objections and has some effort put in.

People can concern-troll that we shoot down objections to "core" principles we have all taken for granted, but if you go to Math Stack Exchange and post about how 10 + 10 should be 21 in as many words, I think you'll find similar "cherished institutions". Sometimes the appropriate objection is "go back to 101", and as with any subject, some people may unfortunately never be able to get more than an F in the class.


Comment by Legionnaire on The Orthogonality Thesis is Not Obviously True · 2023-04-05T23:26:34.051Z · LW · GW

> If you really accept the practical version of the Orthogonality Thesis, then it seems to me that you can’t regard education, knowledge, and enlightenment as instruments for moral betterment.


Scott doesn't understand why this works. Knowledge helps you achieve your goals. Since most humans already have some moral goals, such as minimizing the suffering of those around them, knowledge assists in achieving them and in noticing when you fail to achieve them. E.g. a child who isn't aware that stealing causes real suffering in the victim: learning this would change their behavior. But a psychopath would not change. A dumb paperclip maximizer could achieve "betterment" by listening to a smart paperclip maximizer and learning all the ways it can get more paperclips, like incinerating all the humans for their atoms. Betterment through knowledge!

> that agent would figure out that some things just aren’t worth doing

Worth it relative to what? Worth is entirely relative. The entire concept of the paperclip maximizer is that it finds paperclips maximally worth it. It would value human suffering the way you value money: as a means to an end.

Consider how you would build this robot. When you program its decision algorithm to rank possible future world states to decide what to do next, would you need to add special code to ignore suffering? No. You'd write `return worldState1.numPaperclips > worldState2.numPaperclips;`. The part of the program generating these possible actions and their resulting world states could discriminate against ones involving human suffering, but again, why would it? You'd have to write extra code to do that. The general algorithm that explores future actions and states will happily generate ones with you on fire that result in 1 more paperclip, among others. If it didn't, it would by definition be broken and not general.
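A toy sketch of that decomposition (illustrative only, not how any real system is written): a generator proposes candidate world states, and the ranking rule only ever looks at the paperclip count.

```python
# Toy paperclip-maximizer decision step: the ranking consults num_paperclips alone;
# no special code is needed to ignore suffering, it simply never enters the objective.
from dataclasses import dataclass

@dataclass
class WorldState:
    num_paperclips: int
    human_suffering: int  # modeled by the world model, but never consulted below

def better(state1: WorldState, state2: WorldState) -> bool:
    return state1.num_paperclips > state2.num_paperclips

def candidate_states() -> list[WorldState]:
    # Stand-in for the general planner: it happily proposes states that trade a lot of
    # suffering for one more paperclip, because nothing tells it not to.
    return [WorldState(num_paperclips=100, human_suffering=0),
            WorldState(num_paperclips=101, human_suffering=1_000_000)]

states = candidate_states()
best = states[0]
for s in states[1:]:
    if better(s, best):
        best = s
print(best)  # the higher-paperclip, higher-suffering state wins
```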

> even if being smart doesn’t make a person automatically care about others, if it would make them care about themselves, that’s still a non-disastrous scenario

A selfish entity that only wants to maximize the number of paperclips (and keep itself around) is very much disastrous for you.

Comment by Legionnaire on I Converted Book I of The Sequences Into A Zoomer-Readable Format · 2023-02-09T21:02:13.262Z · LW · GW

I recommend making the on-screen text at least a little larger; that's a common trope in infotainment and it works well.

Comment by Legionnaire on chinchilla's wild implications · 2022-08-01T19:58:37.281Z · LW · GW

We're not running out of data to train on, just text.

Why did I not need 1 trillion language examples to speak (debatably) intelligently? I suspect the reason is a combination of inherited training examples from my ancestors and, more importantly, the fact that language output is only the surface layer.

In order for language models to get much better, I suspect they need to be training on more than just language. It's difficult to talk intelligently about complex subjects if you've only ever read about them. Especially if you have no eyes, ears, or any other sense data. The best language models are still missing crucial context/info which could be gained through video, audio, and robotic IO.

Combined with this post, this would also suggest our hardware can already train more parameters than we need in order to get much more intelligent models, if we can get that data from non-text sources.

Comment by Legionnaire on Why will AI be dangerous? · 2022-02-10T04:57:09.101Z · LW · GW

> I would need to understand why early AIs would become so much more powerful than corporations, terrorists or nation-states


One argument I removed to make it shorter was approximately: "It doesn't have to take over the world to cause you harm." And since early misaligned AI is more likely to appear in a developed country, your odds of being harmed by it are higher than those of someone in an undeveloped country. If ISIS suddenly found itself 500 strong in Silicon Valley and in control of Google's servers, surely you would have the right to be concerned before it had a good chance of taking over the whole world. And you'd be doubly worried if you did not understand how it went from 0 to 500 "strong", or what the next increase in strength might be. You understand how nation-states and terrorist organizations grow. I don't think anyone currently understands, well, how AI grows in intelligence.

There were a million other arguments I wanted to "head off" in this post, but the whole point of introductory material is to be short.

> there is no reason to believe that rogue AI will be dramatically more powerful than corporations or terrorists

I don't think that's true. If our AI ends up no more powerful than existing corporations or terrorists, why are we spending billions on it? It had better be more powerful than something. I agree alignment might not be "solvable" for the reasons you mention, and I don't claim that it is. 

I am specifically claiming AI will be unusually dangerous, though.

Comment by Legionnaire on Why will AI be dangerous? · 2022-02-08T20:30:29.459Z · LW · GW
Comment by Legionnaire on Why will AI be dangerous? · 2022-02-08T20:24:10.965Z · LW · GW

As another short argument: we don't need an argument for why AI is dangerous, because danger is the default state of powerful things. There needs to be a reason AI would be safe.