6 (Potential) Misconceptions about AI Intellectuals

post by ozziegooen · 2025-02-14T23:51:44.983Z · LW · GW · 11 comments

11 comments

Comments sorted by top scores.

comment by William_S · 2025-02-16T21:23:50.693Z · LW(p) · GW(p)

Would be nice to have an LLM+prompt that tries to produce reasonable AI strategy advice based on a summary of the current state of play, have some way to validate that the advice is reasonable, and be able to see how it updates as events unfold.

Replies from: ozziegooen
comment by ozziegooen · 2025-02-17T01:10:10.494Z · LW(p) · GW(p)

Agreed. I'm curious how to best do this.

One thing that I'm excited about is using future AIs to judge current ones. So we could have a system that does:
1. An AI today (or a human) would output a certain recommended strategy.
2. In 10 years, we agree to have the most highly-trusted AI evaluator evaluate how strong this strategy was, on some numeric scale. We could also wait until we have a "sufficient" AI, meaning that there might be some set point at which we'd trust AIs to do this evaluation. (I discussed this more here [LW · GW])
3. Going back to ~today, we have forecasting systems predict how well the strategy from (1) will score in (2). (A rough sketch of this loop is below.)
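
To make that loop concrete, here's a minimal sketch of the bookkeeping it implies. Everything in it (the `StrategyRecord` name, the 0-100 score scale, the example strategy and forecaster names) is my own hypothetical illustration, not something specified above:

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class StrategyRecord:
    """One recommended strategy, plus forecasts of how a future trusted evaluator will score it."""
    strategy: str
    author: str
    submitted: date
    # forecaster name -> predicted score on a 0-100 scale
    forecasts: dict[str, float] = field(default_factory=dict)
    # filled in years later by the trusted AI evaluator
    resolved_score: float | None = None

    def add_forecast(self, forecaster: str, predicted_score: float) -> None:
        # Step 3: today's forecasters predict the score the future evaluator will assign.
        self.forecasts[forecaster] = predicted_score

    def resolve(self, evaluator_score: float) -> None:
        # Step 2: once a "sufficiently trusted" AI evaluator exists, it scores the strategy.
        self.resolved_score = evaluator_score

    def forecaster_errors(self) -> dict[str, float]:
        # After resolution, score each forecaster by absolute error against the evaluator's number.
        assert self.resolved_score is not None, "evaluation not yet resolved"
        return {name: abs(pred - self.resolved_score) for name, pred in self.forecasts.items()}


# Step 1: an AI (or human) outputs a recommended strategy today.
record = StrategyRecord(
    strategy="Prioritize interpretability work over capability evals",
    author="some-2025-model",
    submitted=date(2025, 2, 17),
)
record.add_forecast("forecaster_a", 62.0)
record.add_forecast("forecaster_b", 48.0)
# ...roughly a decade later...
record.resolve(evaluator_score=55.0)
print(record.forecaster_errors())  # {'forecaster_a': 7.0, 'forecaster_b': 7.0}
```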

comment by William_S · 2025-02-16T21:20:39.647Z · LW(p) · GW(p)

A couple of advantages for AI intellectuals could be:
- being able to rerun them on different inputs, to see how their analysis changes as a function of those inputs
- being able to view full reasoning traces (still not the full story, but probably more of it than we get with human reasoning; good intellectuals already try to share their process, but this could go further and be used to weed out clearly bad approaches)

Replies from: ozziegooen
comment by ozziegooen · 2025-02-17T01:13:29.438Z · LW(p) · GW(p)

Yep! 

On "rerun based on different inputs", this would work cleanly with AI forecasters. You can literally say, "Given that you get a news article announcing a major crisis X that happens tomorrow, what is your new probability on Y?" (I think I wrote about this a bit before, can't find it right now).

I did write more about how a full-scale forecasting system could be built and evaluated, here, for those interested:
https://www.lesswrong.com/posts/QvFRAEsGv5fEhdH3Q/preliminary-notes-on-llm-forecasting-and-epistemics [LW · GW]
https://www.lesswrong.com/posts/QNfzCFhhGtH8xmMwK/enhancing-mathematical-modeling-with-llms-goals-challenges [LW · GW]

Overall, I think there's just a lot of neat stuff that could be done. 

comment by PeterMcCluskey · 2025-02-16T17:31:30.647Z · LW(p) · GW(p)

It would certainly be valuable to have AIs that are more respected than Wikipedia as a source of knowledge.

I have some concerns about making AIs highly strategic. I see some risk that strategic abilities will be the last step in the development of AI that is powerful enough to take over the world. Therefore, pushing AI intellectuals to be strategic may bring that risk closer.

I suggest aiming for AI intellectuals that are a bit more passive, but still authoritative enough to replace academia as the leading validators of knowledge.

Replies from: ozziegooen, ozziegooen
comment by ozziegooen · 2025-02-16T20:07:46.072Z · LW(p) · GW(p)

"I see some risk that strategic abilities will be the last step in the development of AI that is powerful enough to take over the world."

Just fyi - I feel like this is similar to what others have said. Most recently, benwr had a post here: https://www.lesswrong.com/posts/5rMwWzRdWFtRdHeuE/not-all-capabilities-will-be-created-equal-focus-on?commentId=uGHZBZQvhzmFTrypr#uGHZBZQvhzmFTrypr [LW(p) · GW(p)]

Maybe we could call this something like "Strategic Determinism".

I think a more precise claim that I could understand might be:
1. The main bottleneck to AI advancement is "strategic thinking"
2. There's a decent amount of uncertainty on when or if "strategic thinking" will be "solved"
3. Human actions might have a lot of influence over (2). Depending on what choices humans make, strategic thinking might be solved sooner or much later.
4. Shortly after "strategic thinking" is solved, we gain a lot of certainty on what future trajectory will be like. As in, the fate of humanity is sort of set by this point, and further human actions won't be able to change it much.
5. "Strategic thinking" will lead to a very large improvement in potential capabilities. One main reason is that it would lead to recursive self-improvement. If there is one firm that has sole access to an LLM with "strategic thinking", it is likely to develop a decisive strategic advantage.

Personally, such a view seems too clean to me.
1. I expect there will be a long period in which LLMs get better at different aspects of strategic thinking, and this helps only to limited extents.
2. I expect that better strategy will yield limited gains in LLM capabilities, for some time. The strategy might suggest better LLM improvement directions, but these ideas won't actually help that much. Maybe a firm with a 10% better strategist would be able to improve its effectiveness by 5% per year or something.
3. I think there could be a bunch of worlds where we have "idiot savants" that are amazing at some narrow kinds of tasks (coding, finance) but have poor epistemics in many ways we really care about. These will make tons of money, despite being very stupid in important ways.
4. I expect that many of the important gains that would come from "great strategy" will instead be captured in other ways, like narrow RL. A coding system heavily optimized with RL wouldn't benefit that much from added "strategy" capabilities.
5. A lot of the challenges for things like "making a big codebase" aren't about "being a great strategist", but about narrower problems like "how to store a bunch of context in memory" or "basic reasoning processes for architecture decisions specifically".

comment by ozziegooen · 2025-02-16T19:50:05.511Z · LW(p) · GW(p)

Alexander Gordon-Brown challenged me on a similar question here:
https://www.facebook.com/ozzie.gooen/posts/pfbid02iTmn6SGxm4QCw7Esufq42vfuyah4LCVLbxywAPwKCXHUxdNPJZScGmuBpg3krmM3l

One thing I wrote there:
 

I didn't spend much time on the limitations of such intellectuals. For the use cases I'm imagining, it's fairly fine for them to be slow, fairly expensive (maybe it would cost $10/hr to chat with them), and not very great at any specific discipline. Maybe you could spend $10 to $100 and get the equivalent of one Scott Alexander essay, on any topic he could write about, for example.

I think that such a system could be pretty useful in certain AI agents, but I wouldn't expect it to be a silver bullet. I'm really unsure if it's the "missing link."

I expect that a lot of these systems would be somewhat ignored when it comes to using them to give humans a lot of high-level advice, similar to how prediction markets or econ experts get ignored.

It's tricky to understand the overlap between high-level reasoning as part of an AI coding tool-chain (where such systems would have clear economic value) and such reasoning in big-picture decision-making (where we might expect some of this to be ignored for a while). Maybe I'd expect that the narrow uses could be handled equally well with more domain-specific optimizations. For instance, reinforcement learning on large codebases already does decently well on a lot of the "high-level strategy" necessary (though it doesn't think of it this way), and doesn't need some specialized "strategy" component.

I expect that over time we'll develop better notions about how to split up and categorize the skills that make up strategic work. I suspect some things will have a good risk-reward tradeoff and some won't. 

I expect that people in the rationality community over-weight the importance of, well, rationality. 
 

"I suggest aiming for AI intellectuals that are a bit more passive, but still authoritative enough to replace academia as the leading validators of knowledge."

My main point is that I think our community should be taking this topic seriously, and that I expect there's a lot of good work that could be done that's tractable, valuable, and safe. I'm much less sure about exactly what that work is, and I definitely recommend that work here really try to maximize the reward/risk ratio.

Some quick heuristics that I assume would be good are:
- Having AIs be more correct about epistemics and moral reasoning on major global topics generally seems good. Ideally there are ways of getting that which don't require huge generic LLM gains.
- We could aim for expensive and slow systems.
- There might not be a need to publicize such work much outside of our community. (This is often hard to do anyway).
- There's a lot of work that would be useful for people we generally trust but would alienate most others (or be less useful for other use cases). I think our community focuses much more on truth-seeking, Bayesian analysis, forecasting, etc.
- Try to quickly get the best available reasoning systems we might have access to, to be used to guide strategy on AI safety. In theory, this cluster can be ahead-of-the-curve.
- Great epistemic AI systems don't need much agency or power. We can heavily restrict them to be tool AIs.
- Obviously, if things get seriously powerful, there are a lot of techniques that could be used (control, evals, etc.) to move slowly and err on the safe side.

Replies from: ozziegooen
comment by ozziegooen · 2025-02-16T20:10:18.734Z · LW(p) · GW(p)

I'd lastly flag that I sort of addressed this basic claim in "Misconceptions 3 and 4" in this piece. 

comment by habryka (habryka4) · 2025-02-15T03:46:51.748Z · LW(p) · GW(p)

"While artificial intelligence has made impressive strides in specialized domains like coding, art, and medicine, I think its potential to automate high-level strategic thinking has been surprisingly underrated. I argue that developing "AI Intellectuals" - software systems capable of sophisticated strategic analysis and judgment - represents a significant opportunity that's currently being overlooked, both by the EA/rationality communities and by the public."

FWIW, this paragraph reads as LLM-generated to me (then I stopped reading, because I have a strong prior that content that reads as LLM-edited is almost universally low-quality).

Replies from: ozziegooen
comment by ozziegooen · 2025-02-15T05:21:45.435Z · LW(p) · GW(p)

Thanks for letting me know. 

I spent a while writing the piece, then used an LLM to edit the sections, as I flagged in the intro. 

I then spent some time re-editing it back to more of my voice, but only did so for some key parts. 

I think that overall this made it more readable, and I consider the sections to be fairly clear. But I agree that it pattern-matches to LLM output, so if you have a prior that work that sounds like that is bad, you might skip this.

I obviously find that fairly frustrating and don’t myself use that strategy that much, but I could understand it. 

I assume that, bigger-picture, authors and readers could both benefit a lot from LLMs used in similar ways (they can produce cleaner writing, more easily), but I guess right now we're at an awkward point.

comment by ozziegooen · 2025-02-16T20:17:14.305Z · LW(p) · GW(p)

I'm obviously disappointed by the limited attention and downvotes here. Feedback is appreciated.

Not sure whether LessWrong members mostly disagree with the broad point, thought the post was poorly written, or something else.