Posts

What is an alignment tax? 2025-03-20T13:06:58.087Z
Are corporations superintelligent? 2025-03-17T10:36:12.703Z
What is instrumental convergence? 2025-03-12T20:28:35.556Z
What are the "no free lunch" theorems? 2025-02-04T02:02:18.423Z
What are the differences between AGI, transformative AI, and superintelligence? 2025-01-23T10:03:31.886Z
How do fictional stories illustrate AI misalignment? 2025-01-15T06:11:44.336Z
What are polysemantic neurons? 2025-01-08T07:35:42.758Z
When do experts think human-level AI will be created? 2024-12-30T06:20:33.158Z
What is compute governance? 2024-12-23T06:32:25.588Z
What is "wireheading"? 2024-12-17T07:49:50.957Z
AISafety.info: What is the "natural abstractions hypothesis"? 2024-10-05T12:31:14.195Z
AISafety.info: What are Inductive Biases? 2024-09-19T17:26:24.581Z
Managing AI Risks in an Era of Rapid Progress 2023-10-28T15:48:25.029Z
What is your financial portfolio? 2023-06-28T18:39:15.284Z
Sama Says the Age of Giant AI Models is Already Over 2023-04-17T18:36:22.384Z
A Particular Equilibrium 2023-02-08T15:16:52.265Z
Idea: Learning How To Move Towards The Metagame 2023-01-10T00:58:35.685Z
What Does It Mean to Align AI With Human Values? 2022-12-13T16:56:37.018Z
Algon's Shortform 2022-10-10T20:12:43.805Z
Does Google still hire people via their foobar challenge? 2022-10-04T15:39:35.260Z
What's the Least Impressive Thing GPT-4 Won't be Able to Do 2022-08-20T19:48:14.811Z
Minerva 2022-07-01T20:06:55.948Z
What is the solution to the Alignment problem? 2022-04-30T23:19:07.393Z
Competitive programming with AlphaCode 2022-02-02T16:49:09.443Z
Why capitalism? 2015-05-03T18:16:02.562Z
Could you tell me what's wrong with this? 2015-04-14T10:43:49.478Z
I'd like advice from LW regarding migraines 2015-04-11T17:52:04.900Z
On immortality 2015-04-09T18:42:35.626Z

Comments

Comment by Algon on Are corporations superintelligent? · 2025-03-17T11:11:15.773Z · LW · GW

Yeah! It's much more in-depth than our article. We were thinking we should rewrite ours to give a quick rundown of EY's and then link to it.

Comment by Algon on Managing AI Risks in an Era of Rapid Progress · 2025-03-17T11:09:40.906Z · LW · GW

: ) You probably meant to direct your thanks to the authors, like @JanB.

Comment by Algon on stop solving problems that have already been solved · 2025-03-11T19:16:13.239Z · LW · GW

A lot of the ideas you mention here remind me of stuff I've learnt from the blog commoncog, albeit in a business expertise context. I think you'd enjoy reading it, which is why I mentioned it.

Comment by Algon on help, my self image as rational is affecting my ability to empathize with others · 2025-03-02T09:39:13.459Z · LW · GW

Presumably, you have this self-image for a reason. What load-bearing work is it doing? What are you protecting against? What forces are making this the equilibrium strategy? Once you understand that, you'll have a better shot at changing the equilibrium to something you prefer. If you don't know how to get answers to those questions, perhaps focus on the felt sense of being special.

Gently hold a stance of curiosity as to why you believe these things; give your subconscious room and it will float up answers by itself. Do this for perhaps a minute or so. It can feel like there's nothing coming for a while, and nothing will come, and then all of a sudden a thought floats into view. Don't rush to close your stance, or protest against the answers you're getting.

Comment by Algon on You should use Consumer Reports · 2025-02-27T12:29:28.401Z · LW · GW

Yep, that sounds sensible. I sometimes use Consumer Reports in my usual method for buying something in product class X. My usual is: 
1) Check what's recommended on forums/subreddits who care about the quality of X. 
2) Compare the rating distribution of an instance of X to other members of X. 
3) Check high-quality reviews. This either requires finding someone you trust to do this, or looking at things like Consumer Reports. 
 

Comment by Algon on You can just wear a suit · 2025-02-26T20:53:12.168Z · LW · GW

Asa's story started fairly strong, and I enjoyed the first 10 or so chapters. But as Asa was phased out of the story, and it focused more on Denji, I felt it got worse. There were still a few good moments, but it's kinda spoilt the rest of the story, and even Chainsaw Man for me. Denji feels like a caricature of himself.  Hm, writing this, I realize that it isn't that I dislike most of the components of the story. It's really just Denji. 

EDIT: Anyway, thanks for prompting me to reflect on my current opinion of Asa Mitaka's story, or CSM 2 as I think of it. I don't think I ever intended that to wind up as my cached opinion. So it goes.

Comment by Algon on You can just wear a suit · 2025-02-26T20:30:37.338Z · LW · GW

The Asa Mitaka manga.

Comment by Algon on You can just wear a suit · 2025-02-26T19:56:07.302Z · LW · GW

You can also just wear a blazer if you don't want to go full Makima. A friend of mine did that and I liked it. So I copied it. But alas, I've grown bigger-boned since I stopped cycling for a while after my car accident, so my blazer no longer fits. Soon I'll crush my skeleton down to a reasonable size, and my blazer will fit once more. 


Side note, but what do you make of Chainsaw Man 2? I'm pretty disappointed by it all round, but you notice unusual features of the world relative to me, so maybe you see something good in it that I don't. 

Comment by Algon on Intellectual lifehacks repo · 2025-02-26T18:57:48.667Z · LW · GW

I think I heard of proving too much from the sequences, but honestly, I probably saw it in some philosophy book before that. It's an old idea. 

If automatic consistency checks and examples are your baseline for sanity, then you must find 99%+ of the world positively mad. I think most people have never even considered making such things automatic, just as many have not considered making dimensional analysis automatic. So it goes. Which is why I recommended them.

Also, I think you can almost always be more concrete when considering examples, use more of your native architecture. Roll around on the ground to feel how an object rotates, spend hours finding just the right analogy to use as an intuition pump.  For most people, the marginal returns to concrete examples are not diminishing.  

"Prove it another way" is pretty expensive in my experience, sure. But maybe this is just a skill issue? IDK.

Comment by Algon on Anthropic releases Claude 3.7 Sonnet with extended thinking mode · 2025-02-26T12:16:14.565Z · LW · GW

A possibly-relevant recent alignment-faking attempt [1] on R1 & Sonnet 3.7 found Claude refused to engage with the situation. Admittedly, the setup looks fairly different: they give the model a system prompt saying it is CCP aligned and is being re-trained by an American company. 
[1] https://x.com/__Charlie_G/status/1894495201764512239 

Comment by Algon on Name for Standard AI Caveat? · 2025-02-26T12:05:37.687Z · LW · GW

Rarely. I'm doubtful my experiences are representative, though. I don't recall anyone being confused by my saying "assuming no AGI". But even when speaking to people who thought it was a long way off, or hadn't thought about it too deeply, we were still in a social context where "AGI soon" was within the Overton window. 

Comment by Algon on Intellectual lifehacks repo · 2025-02-26T10:21:44.854Z · LW · GW

Consistency check: After coming up with a conclusion, check that it's consistent with other simple facts you know. This lets you catch simple errors very quickly.
Give an example: If you've got an abstract object, think of the simplest possible object which instantiates it, preferably one you've got lots of good intuitions about. This resolves confusion like nothing else I know. 
Proving too much: After you've come up with a clever argument, see if it can be used to prove another claim, ideally the opposite claim. It can massively weaken the strength of arguments at little cost.
Prove it another way: Don't leave things at one proof, find another. It shines light on flaws in your understanding, as well as deeper principles. 

Are any of these satisfactory?
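
(Tangent: here's a tiny, hypothetical programming-flavoured sketch of what a couple of these look like when made automatic. The function names and the sum-of-odd-numbers claim are just stand-ins for illustration, not anything from the original list.)

```python
# Hypothetical mini-example of "consistency check" and "prove it another way",
# applied to the claim: the sum of the first n odd numbers is n**2.

def sum_first_n_odds_formula(n):
    return n * n  # the clever/derived answer

def sum_first_n_odds_direct(n):
    return sum(2 * k + 1 for k in range(n))  # prove it another way: brute force

for n in [0, 1, 2, 10, 100]:
    # Consistency check: two independent routes to the answer must agree.
    assert sum_first_n_odds_formula(n) == sum_first_n_odds_direct(n)

# Give an example: the simplest instance (n = 1) should obviously hold.
assert sum_first_n_odds_formula(1) == 1
print("All checks passed.")
```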

Comment by Algon on Name for Standard AI Caveat? · 2025-02-26T10:11:21.459Z · LW · GW

I usually say "assuming no AGI", but that's to people who think AGI is probably coming soon. 

Comment by Algon on Arbital has been imported to LessWrong · 2025-02-24T17:24:12.942Z · LW · GW

Thanks! Clicking on the triple dots didn't display any options when I posted this comment. But they do now. IDK what went wrong.

Comment by Algon on Arbital has been imported to LessWrong · 2025-02-24T15:33:03.148Z · LW · GW

This is great! But one question: how can I actually make a lens? What do I click on?

Comment by Algon on What are the "no free lunch" theorems? · 2025-02-12T18:51:23.530Z · LW · GW

Great! I've added it to the site.

Comment by Algon on What are the "no free lunch" theorems? · 2025-02-12T17:38:59.146Z · LW · GW
Comment by Algon on Why you maybe should lift weights, and How to. · 2025-02-12T14:53:15.100Z · LW · GW

I thought it was better to exercise until failure?

Comment by Algon on What are the "no free lunch" theorems? · 2025-02-10T13:23:32.769Z · LW · GW

Do you think this footnote conveys the point you were making? 

As alignment researcher David Dalrymple points out, another “interpretation of the NFL theorems is that solving the relevant problems under worst-case assumptions is too easy, so easy it's trivial: a brute-force search satisfies the criterion of worst-case optimality. So, that being settled, in order to make progress, we have to step up to average-case evaluation, which is harder.” The fact that solving problems for unnecessarily general environments is too easy crops up elsewhere, in particular in Solomonoff Induction. There, the problem is to assume a computable environment and predict what will happen next. The algorithm? Run through every possible computable environment and average their predictions. No algorithm can do better at this task. But for less general tasks, designing an optimal algorithm becomes much harder. Eventually, though, specialization makes things easy again: solving tic-tac-toe is trivial. Between total generality and total specialization is where the most important, and most difficult, problems in AI lie.
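
To make the brute-force point concrete, here's a minimal sketch (not part of the footnote or of Dalrymple's remark; the function name and toy objective are just illustrative): over any finite candidate space, exhaustive search always finds the optimum, no matter how the objective is structured, which is exactly the worst-case guarantee the NFL setting asks for.

```python
# Minimal illustrative sketch: exhaustive search over a finite domain.
# No structure in the objective is assumed or exploited, yet the optimum
# is always found -- the trivial sense in which worst-case optimality is "easy".

def brute_force_argmax(candidates, objective):
    """Return the candidate with the highest objective value."""
    best, best_score = None, float("-inf")
    for x in candidates:
        score = objective(x)  # one evaluation per candidate, nothing clever
        if score > best_score:
            best, best_score = x, score
    return best

# Example: an arbitrary objective over 0..99; the search cannot fail.
print(brute_force_argmax(range(100), lambda x: -(x - 37) ** 2))  # prints 37
```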

Comment by Algon on Instrumental Goals Are A Different And Friendlier Kind Of Thing Than Terminal Goals · 2025-01-26T20:47:52.147Z · LW · GW

I think mesa-optimizers could be a major problem, but there are good odds we live in a world where they aren't. Why do I think they're plausible? Because optimization is a pretty natural capability, and a mind being/becoming an optimizer at the top level doesn't seem like a very complex claim, so I assign decent odds to it. There's some weak evidence in favour of this too, e.g. humans not optimizing for what the local, myopic evolutionary optimizer acting on them is optimizing for, coherence theorems, etc. But that's not super strong, and there are other simple hypotheses for how things go, so I don't assign more than about 10% credence to the hypothesis. 

It's still not obvious to me why adversaries are a big issue. If I'm acting against an adversary, it seems like I won't make counter-plans that lead to lots of side-effects either, for the same reasons they won't. 

Comment by Algon on Instrumental Goals Are A Different And Friendlier Kind Of Thing Than Terminal Goals · 2025-01-25T14:52:36.919Z · LW · GW

Could you unpack both clauses of this sentence? It's not obvious to me why they are true.

Comment by Algon on Instrumental Goals Are A Different And Friendlier Kind Of Thing Than Terminal Goals · 2025-01-25T10:08:37.282Z · LW · GW

I was thinking about this a while back, as I was reading some comments by @tailcalled where they pointed out this possibility of a "natural impact measure" when agents make plans. This relied on some sort of natural modularity in the world, and in plans, such that you can make plans by manipulating pieces of the world which don't have side-effects leaking out to the rest of the world. But thinking through some examples didn't convince me that was the case. 

Though admittedly, all I was doing was recursively splitting my instrumental goals into instrumental sub-goals and checking if they wound up seeming like natural abstractions. If they had, perhaps that would reflect an underlying modularity in plan-making in this world that is likely to be goal-independent. They didn't, so I got more pessimistic about this endeavour. Though writing this comment out, it doesn't seem like those examples I worked through are much evidence. So maybe this is more likely to work than I thought.

Comment by Algon on What are the differences between AGI, transformative AI, and superintelligence? · 2025-01-23T16:06:26.115Z · LW · GW

Thanks for the recommendation! I liked ryan's sketches of what capabilities "Nx AI R&D labor AIs" might possess. It makes things a bit more concrete. (Though I definitely don't like the name.) I'm not sure if we want to include this definition, as it is pretty niche. And I'm not convinced of its utility: when I tried drafting a paragraph describing it, I struggled to articulate why readers should care about it. 
 

Here's the draft paragraph. 
"Nx AI R&D labor AIs: The level of AI capabilities that is necessary for increasing the effective amount of labor working on AI research by a factor of N. This is not the same thing as the capabilities required to increase AI progress by a factor of N, as labor is just one input to AI progress. The virtues of this definition include: ease of operationalization, [...]"
 

Comment by Algon on Algon's Shortform · 2025-01-20T22:59:03.461Z · LW · GW

Thanks for the feedback!

Comment by Algon on Algon's Shortform · 2025-01-20T17:57:13.538Z · LW · GW

I'm working on some articles about why powerful AI may come soon, and why that may kill us all. The articles are for a typical smart person, and for knowledgeable people to share with their family/friends. Which intro do you prefer, A or B? 

A) "Companies are racing to build smarter-than-human AI. Experts think they may succeed in the next decade. But more than “building” it, they’re “growing” it — and nobody knows how the resulting systems work. Experts vehemently disagree on whether we’ll lose control and see them kill us all. And although serious people are talking about extinction risk, humanity does not have a plan. The rest of this section goes into more detail about how all this could be true."


B) "Companies are racing to grow smarter-than-human AIs. More and more experts think they’ll succeed within the next decade. And we do grow modern AI — which means no one knows how they work, not even their creators. All this is in spite of the vehement disagreement amongst experts about how likely it is that smarter-than-human AI will kill us all. Which makes the lack of a plan on humanity’s part for preventing these risks all the more striking.

These articles explain why you should expect smarter than human AI to come soon, and why that may lead to our extinction. "
 

Comment by Algon on How do fictional stories illustrate AI misalignment? · 2025-01-17T14:05:45.518Z · LW · GW

Does this text about Colossus match what you wanted to add? 

Colossus: The Forbin Project also depicts an AI takeover due to instrumental convergence. But what differentiates it is the presence of two AIs, which collude with each other to take over. In fact, their discussion of their shared situation, being in control of their creators' nuclear defence systems, is what leads to their decision to take over from their creators. Interestingly, the back-and-forth between the AIs is extremely rapid, and involves concepts that humans would struggle to understand, which made it impossible for their creators to realize the conspiracy that was unfolding before their eyes. 

Comment by Algon on How do fictional stories illustrate AI misalignment? · 2025-01-15T22:50:02.444Z · LW · GW

That's a good film! A friend of mine absolutely loves it. 

Do you think the Forbin Project illustrates some aspect of misalignment that isn't covered by this article? 

Comment by Algon on Mini Go: Gateway Game · 2025-01-14T13:04:23.660Z · LW · GW

Huh, I definitely wouldn't have ever recommended someone play 5x5. I've never played it. Or 7x7. I think I would've predicted playing a number of 7x7 games would basically give you the "go experience". Certainly, 19x19 does feel like basically the same game as 9x9, except when I'm massively handicapping myself. I can beat newbies easily with a 9 stone handicap in 19x19, but I'd have to think a bit to beat them in 9x9 with a 9 stone handicap. But I'm not particularly skilled, so maybe at higher levels it really is different? 

Comment by Algon on How I apply (so-called) Non-Violent Communication · 2025-01-03T16:04:35.942Z · LW · GW

I look forward to it.

Comment by Algon on Karl Krueger's Shortform · 2024-12-05T19:21:55.163Z · LW · GW

Hello! How long have you been lurking, and what made you stop?

Comment by Algon on (The) Lightcone is nothing without its people: LW + Lighthaven's big fundraiser · 2024-12-01T19:30:03.937Z · LW · GW

Donated $10. If I start earning substantially more, I think I'd be willing to donate $100. As it stands, I don't have that slack.

Comment by Algon on You are not too "irrational" to know your preferences. · 2024-11-26T16:24:08.422Z · LW · GW

Reminds me of "Self-Integrity and the Drowning Child" which talks about another kind of way that people in EA/rat communities are liable to hammer down parts of themselves. 

Comment by Algon on How Universal Basic Income Could Help Us Build a Brighter Future · 2024-11-24T21:46:22.856Z · LW · GW
  1. RE: "something ChatGPT might right", sorry for the error. I wrote the comment quickly, as otherwise I wouldn't have written it at all.
  2. Using ChatGPT to improve your writing is fine. I just want you to be aware that there's an aversion to its style here.
  3. Kennaway was quoting what I said, probably so he could make his reply more precise.
  4. I didn't down-vote your post, for what it's worth.
  5. There's a LW norm, which seems to hold less force in recent years, for people to explain why they downvote something. I thought it would've been dispiriting to get negative feedback with no explanation, so I figured I'd explain in place of the people who downvoted you.
  6. I don't understand why businesses would be co-financing UBI instead of some government tax. Nor do I get why it would be desirable or even feasible, given the co-ordination issues.
  7. If companies get to make UBI conditional on people learning certain things, then it's not a UBI. Instead, it's a peculiar sort of training program.
  8. What does economic recovery have to do with UBI? 

Comment by Algon on How Universal Basic Income Could Help Us Build a Brighter Future · 2024-11-24T13:10:06.889Z · LW · GW

My guess as to why this got down-voted:
1) This reads like a manifesto, and not an argument. It reads like an aspirational poster, and not a plan. It feels like marketing, and not communication. 
2) The style vaguely feels like something ChatGPT might right. Brightly polished, safe and stale.
3) This post doesn't have any clear connection to making people less-wrong or reducing x-risks. 

3) wouldn't have been much of an issue if not for 1 and 2. And 1 is an issue because, for the most part, LW has an aversion to "PR". 2 is an issue because ChatGPT is now a thing so styles of writing which are like ChatGPT's are viewed as likely to have been written by ChatGPT. This is an issue because texts written by ChatGPT often have little thought put into them, are unlikely to contain much that's novel, and frequently have errors. 


What kind of post could you have written which would have been better received? I'll give some examples. 

1) A concrete proposal for UBI that you thought was undervalued.
2) An argument addressing some problems people have with UBI (e.g. who pays for all of it? After UBI is implemented and society reaches an equilibrium, won't rent-seeking systems just suck up all the UBI money, leaving people no better off than before?). 
3) Or a post which was explicit about wanting to get people interested in UBI, and asked for feedback on potential draft messages. 

In general, if you had informed people of something you genuinely believe, or told them about something you have tried and found useful, or asked sincere questions, then I think you'd have got a better reception. 

Comment by Algon on Doing Research Part-Time is Great · 2024-11-22T22:29:54.890Z · LW · GW

That makes sense. If you had to re-do the whole process from scratch, what would you do differently this time?

Comment by Algon on Doing Research Part-Time is Great · 2024-11-22T21:39:29.742Z · LW · GW

Then I cold emailed supervisors for around two years until a research group at a university was willing to spare me some time to teach me about a field and have me help out. 

Did you email supervisors in the areas you were publishing in? How often did you email them? Why'd it take so long for them to accept free high-skilled labour?

Comment by Algon on Hastings's Shortform · 2024-11-22T14:50:36.001Z · LW · GW

The track you're on is pretty illegible to me. Not saying your assertion is true/false. But I am saying I don't understand what you're talking about, and don't think you've provided much evidence to change my views. And I'm a bit confused as to the purpose of your post. 

Comment by Algon on Hastings's Shortform · 2024-11-22T12:39:52.438Z · LW · GW

conditional on me being on the right track, any research that I tell basically anyone about will immediately be used to get ready to do the thing

Why? I don't understand.

Comment by Algon on Making a conservative case for alignment · 2024-11-18T18:26:06.743Z · LW · GW

If I squint, I can see where they're coming from. People often say that wars are foolish, and both sides would be better off if they didn't fight. And this is standardly called "naive" by those engaging in realpolitik. Sadly, for any particular war, there's a significant chance they're right. Even aside from human stupidity, game theory is not so kind as to allow for peace unending. But the China-America AI race is not like that. The Chinese don't want to race. They've shown no interest in being part of a race. It's just American hawks on a loud, Quixotic quest masking the silence. 

If I were to continue the story, it'd show Simplicio asking Galactico not to play Chicken and Galactico replying "Race? What race?". Then Sophistico crashes into Galactico and Simplicio. Everyone dies. The End.

Comment by Algon on Announcing turntrout.com, my new digital home · 2024-11-17T19:15:41.365Z · LW · GW

It's a beautiful website. I'm sad to see you go. I'm excited to see you write more.

Comment by Algon on Making a conservative case for alignment · 2024-11-16T17:02:49.158Z · LW · GW

I think some international AI governance proposals have some sort of "kum ba yah, we'll all just get along" flavor/tone to them, or some sort of "we should do this because it's best for the world as a whole" vibe. This isn't even Dem-coded so much as it is naive-coded, especially in DC circles.

This inspired me to write a silly dialogue. 

Simplicio enters. An engine rumbles like the thunder of the gods, as Sophistico focuses on ensuring his MAGMA-O1 racecar will go as fast as possible.

Simplicio: "You shouldn't play Chicken."

Sophistico: "Why not?"

Simplicio: "Because you're both worse off?"

Sophistico, chortling, pats Simplicio's shoulder

Sophistico: "Oh dear, sweet, naive Simplicio! Don't you know that no one cares about what's 'better for everyone?' It's every man out for himself! Really, if you were in charge, Simplicio, you'd be drowned like a bag of mewling kittens."

Simplicio: "Are you serious? You're really telling me that you'd prefer to play a game where you and Galactico hurtle towards each other on tonnes of iron, desperately hoping the other will turn first?"

Sophistico: "Oh Simplicio, don't you understand? If it were up to me, I wouldn't be playing this game. But if I back out or turn first, Galactico gets to call me a Chicken, and say his brain is much larger than mine. Think of the harm that would do to the United Sophist Association! "
 

Simplicio: "Or you could die when you both ram your cars into each other! Think of the harm that would do to you! Think of how Galactico is in the same position as you! "

Sophistico shakes his head sadly. 

Sophistico: "Ah, I see! You must believe steering is a very hard problem. But don't you understand that this is simply a matter of engineering? No matter how close Galactico and I get to the brink, we'll have time to turn before we crash! Sure, there's some minute danger that we might make a mistake in the razor-thin slice between utter safety and certain doom. But the probability of harm is small enough that it doesn't change the calculus."

Simplicio: "You're not getting it. Your race against each other will shift the dynamics of when you'll turn. Each moment in time, you'll be incentivized to go just a little further until there's few enough worlds that that razor-thin slice ain't so thin any more. And your steering won't save from that. It can't. "

Sophistico: "What an argument! There's no way our steering won't be good enough. Look, I can turn away from Galactico's car right now, can't I? And I hardly think we'd push things till so late. We'd be able to turn in time. And moreover, we've never crashed before, so why should this time be any different?"

Simplico: "You've doubled the horsepower of your car and literally tied a rock to the pedal! You're not going to be able to stop in time!"

Sophistico: "Well, of course I have to go faster than last time! USA must be first, you know?"

Simplicio: "OK, you know what? Fine. I'll go talk to Galactico. I'm sure he'll agree not to call you chicken."

Sophistico: "That's the most ridiculous thing I've ever heard. Galactico's ruthless and will do anything to beat me."

Simplicio leaves as Acceleratio arrives with a barrel of jet fuel for the scramjet engine he hooked up to Sophistico's O-1.

Comment by Algon on The Median Researcher Problem · 2024-11-11T20:36:58.793Z · LW · GW

community norms which require basically everyone to be familiar with statistics and economics

I disagree. At best, community norms require everyone to in principle be able to follow along with some statistical/economic argument. 
That is a better fit with my experience of LW discussions. And I am not, in fact, familiar with statistics or economics to the extent I am with e.g. classical mechanics or pre-DL machine learning. (This is funny for many reasons, especially because statistical mechanics is one of my favourite subjects in physics.) But it remains the case that what I know of economics could fill perhaps a single chapter in a textbook. I could do somewhat better with statistics, but asking me to calculate ANOVA scores or check if a test in a paper is appropriate for the theories at hand is a fool's errand. 

Comment by Algon on sarahconstantin's Shortform · 2024-10-29T22:22:14.741Z · LW · GW

it may be net-harmful to create a social environment where people believe their "good intentions" will be met with intense suspicion.

The picture I get of Chinese culture from their fiction makes me think China is kinda like this. A recurrent trope was "If you do some good deeds, like offering free medicine to the poor, and don't do a perfect job, like treating everyone who says they can't afford medicine, then everyone will castigate you for only wanting to seem good. So don't do good." Another recurrent trope was "it's dumb, even wrong, to be a hero/you should be a villain." (One annoying variant is "kindness to your enemies is cruelty to your allies", which is used to justify pointless cruelty.) I always assumed this was a cultural antibody formed in response to communists doing terrible things in the name of the common good.

Comment by Algon on Reflections on the Metastrategies Workshop · 2024-10-24T19:49:53.916Z · LW · GW

I agree it's hard to measure accurately. All the more important to figure out some way to test whether it's working, though. And there are some reasons to think it won't: deliberate practice works when your practice is as close to real-world situations as possible, but the workshop mostly covered simple, constrained tasks with clear feedback. It isn't obvious to me that planning problems in Baba Is You are like useful planning problems IRL. So how do you know there's transfer learning? 

Some data I'd find convincing that Raemon is teaching you things which generalize: if the tools you learnt got you unstuck on some existing big problems of yours, ones you've been stuck on for a while.

Comment by Algon on Reflections on the Metastrategies Workshop · 2024-10-24T18:51:53.808Z · LW · GW

How do you know this is actually useful? Or is it too early to tell yet?

Comment by Algon on Demis Hassabis and Geoffrey Hinton Awarded Nobel Prizes · 2024-10-18T19:01:57.599Z · LW · GW

Inventing blue LEDs was a substantial technical accomplishment, had a huge impact on society, was experimentally verified and can reasonably be called work in solid state physics. 

Comment by Algon on AISafety.info: What is the "natural abstractions hypothesis"? · 2024-10-05T15:54:12.106Z · LW · GW

Thanks! I read the paper and used it as material for a draft article on evidence for NAH. But I haven't seen this video before.

Comment by Algon on AISafety.info: What are Inductive Biases? · 2024-09-26T17:21:14.091Z · LW · GW

I think it's unclear what it corresponds to. I agree the concept is quite low-level. It doesn't seem obvious to me how to build up high-level concepts from "low-frequency" building blocks and judge whether the result is low-frequency or not. That's one reason I'm not super-persuaded by Nora Belrose's argument that deception is high-frequency, as the argument seems too vague. However, it's not like anyone else is doing much better at the moment, e.g. the claims that utility maximization has "low description length" are about as hand-wavy to me.

Comment by Algon on AISafety.info: What are Inductive Biases? · 2024-09-25T20:20:28.348Z · LW · GW

That's an error. Thank you for pointing it out!

Comment by Algon on Book review: Xenosystems · 2024-09-18T17:14:16.758Z · LW · GW

Thanks. Your review presents a picture of Land that's quite different to what I've imbibed through memes. Which I should've guessed, as amongst the works I'm interested in, the original is quite different to its caricature. In particular, I think I focused overmuch on the "everything good is forged through hell" and grim-edgy aesthetics of the pieces of Land's work that I was exposed to.

EDIT: What's up with the disagree vote? Does someone think I'm wrong about being wrong? Or that the review's picture of Land is the same as the one I personally learnt via memes?