What are the "no free lunch" theorems?

post by Vishakha (vishakha-agrawal), Algon · 2025-02-04T02:02:18.423Z · LW · GW · 4 comments

This is a link post for https://aisafety.info/questions/8C7T/What-are-the-%22no-free-lunch%22-theorems

This is an article in the featured articles series from AISafety.info, which writes introductory AI safety content. We'd appreciate any feedback.

The most up-to-date version of this article is on our website, along with 300+ other articles on AI existential safety.

“No free lunch” [LW · GW] theorems assert that, on average, every learning algorithm does equally well over all possible learning tasks. An algorithm that does better than chance at predicting some sequences must "pay for lunch" by doing worse at some other sequences.
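
As a toy illustration (a sketch, not a proof): average any fixed bit-predictor's accuracy over all binary sequences of a given length, and it comes out to exactly 50%, no matter how the predictor works. The predictors and sequence length below are arbitrary choices for the demonstration.

```python
# Toy illustration of the "no free lunch" averaging claim: over ALL binary
# sequences of length n, any fixed predictor's accuracy averages to exactly 0.5.
from itertools import product

def accuracy(predictor, sequence):
    """Fraction of bits the predictor guesses correctly, seeing only the prefix so far."""
    correct = sum(predictor(sequence[:i]) == bit for i, bit in enumerate(sequence))
    return correct / len(sequence)

def always_zero(prefix):
    return 0

def copy_last_bit(prefix):
    return prefix[-1] if prefix else 0

n = 8
all_sequences = list(product([0, 1], repeat=n))
for predictor in (always_zero, copy_last_bit):
    mean_acc = sum(accuracy(predictor, s) for s in all_sequences) / len(all_sequences)
    print(predictor.__name__, mean_acc)  # both print exactly 0.5
```

Swapping in any other prefix-based predictor gives the same 0.5 average, which is exactly the theorem's point.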

Some have argued that these theorems imply that fully general intelligence is impossible, and therefore worries about AGI are overblown.

However, "no free lunch" holds only on the entire set of all theoretically possible sequences. The ones our algorithm does worse at may just be fully random, or designed to trick it. But if we start out knowing that the environment that our algorithm operates in has a certain structure, then the “no free lunch” results are not an impediment to designing algorithms with superior predictive or optimizing abilities.

Therefore, for practical AI design purposes, these theorems are often irrelevant. We aren't interested in "predicting" completely random sequences, and we don't mind if another algorithm outperforms us on that "task". No system can be so general as to perform well in every possible universe, but AGI is only required to perform well in one universe - ours. Our universe is far from maximally random, and its laws provide a lot of structure that lets us make good predictions while "paying for lunch" in other possible universes without that structure.
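
Continuing the toy sketch above (self-contained, with an arbitrary, made-up notion of "structured" sequences as ones whose bits almost always repeat): restricting attention to a structured world lets a simple predictor beat chance there, with the "lunch" paid only on the unstructured sequences we never expected to face.

```python
# Same toy setup, now restricted to a "structured" world: sequences whose bits
# almost always repeat the previous bit. A predictor that exploits that structure
# beats chance there, and pays for it only on the unstructured remainder.
from itertools import product

def accuracy(predictor, sequence):
    correct = sum(predictor(sequence[:i]) == bit for i, bit in enumerate(sequence))
    return correct / len(sequence)

def copy_last_bit(prefix):
    return prefix[-1] if prefix else 0

n = 8
all_sequences = list(product([0, 1], repeat=n))
# "Structured": at most one of the n-1 transitions flips the bit.
structured = [s for s in all_sequences
              if sum(s[i] == s[i - 1] for i in range(1, n)) >= n - 2]
unstructured = [s for s in all_sequences if s not in structured]

def mean_accuracy(seqs):
    return sum(accuracy(copy_last_bit, s) for s in seqs) / len(seqs)

print(round(mean_accuracy(structured), 3))     # ~0.828: well above chance
print(round(mean_accuracy(unstructured), 3))   # ~0.478: the lunch, paid elsewhere
print(round(mean_accuracy(all_sequences), 3))  # 0.5 exactly, as the theorem requires
```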

"No free lunch" hasn't prevented humans from exploiting the universe's structure for research and development, and won't prevent artificial systems from doing so much more effectively. The generality needed for AGI to exceed human abilities across the board is not the same kind of generality forbidden by these theorems.

4 comments

Comments sorted by top scores.

comment by Noosphere89 (sharmake-farah) · 2025-02-04T02:52:40.206Z · LW(p) · GW(p)

Another interpretation of the no free lunch theorem by @davidad [LW · GW] is that learning/optimization under worst-case conditions is trivial but also impractical, so you need to add more constraints to get an interesting solution:

https://www.lesswrong.com/posts/yTvBSFrXhZfL8vr5a/worst-case-thinking-in-ai-alignment#N3avtTM3ESH4KHmfN [LW(p) · GW(p)]

Replies from: Algon
comment by Algon · 2025-02-10T13:23:32.769Z · LW(p) · GW(p)

Do you think this footnote conveys the point you were making? 

As alignment researcher David Dalrymple points out, another “interpretation of the NFL theorems is that solving the relevant problems under worst-case assumptions is too easy, so easy it's trivial: a brute-force search satisfies the criterion of worst-case optimality. So, that being settled, in order to make progress, we have to step up to average-case evaluation, which is harder.” The fact that solving problems for unnecessarily general environments is too easy crops up elsewhere, in particular in Solomonoff Induction. There, the problem is to assume a computable environment and predict what will happen next. The algorithm? Run through every possible computable environment and average their predictions, weighting simpler environments more heavily. No algorithm can do better at this task. For less general tasks, designing an optimal algorithm becomes much harder. But eventually, specialization makes things easy again: solving tic-tac-toe is trivial. Between total generality and total specialization is where the most important, and most difficult, problems in AI lie.
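
A toy sketch of that "run through every environment and average" idea (not Solomonoff induction proper: the hypothesis class, names, and description lengths below are made up for illustration; real Solomonoff induction enumerates all programs and weights them by length):

```python
# Toy sketch: a tiny hand-picked hypothesis class stands in for "every computable
# environment", each weighted by 2^(-description length); hypotheses that
# contradict the observed bits are discarded, and the rest vote on the next bit.
HYPOTHESES = [
    # (name, description length in bits -- made-up numbers, generator for the t-th bit)
    ("all zeros", 1, lambda t: 0),
    ("all ones",  1, lambda t: 1),
    ("alternate", 2, lambda t: t % 2),
    ("period 3",  4, lambda t: 1 if t % 3 == 0 else 0),
]

def predict_next(observed):
    """Weighted-majority prediction of the bit after `observed`."""
    weights = {0: 0.0, 1: 0.0}
    for _, length, gen in HYPOTHESES:
        if all(gen(t) == bit for t, bit in enumerate(observed)):
            weights[gen(len(observed))] += 2.0 ** (-length)
    return max(weights, key=weights.get)

print(predict_next([0, 1, 0, 1]))  # 0 -- only "alternate" is still consistent
print(predict_next([1, 1, 1]))     # 1 -- only "all ones" is still consistent
```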

Replies from: sharmake-farah
comment by Noosphere89 (sharmake-farah) · 2025-02-10T17:04:12.706Z · LW(p) · GW(p)

Yes, it does convey the point accurately, according to me.

Replies from: Algon, Algon
comment by Algon · 2025-02-12T18:51:23.530Z · LW(p) · GW(p)

Great! I've added it to the site.

comment by Algon · 2025-02-12T17:38:59.146Z · LW(p) · GW(p)