Posts

Supposing the 1bit LLM paper pans out 2024-02-29T05:31:24.158Z
OpenAI wants to raise 5-7 trillion 2024-02-09T16:15:00.421Z
O O's Shortform 2023-06-03T23:32:12.924Z

Comments

Comment by O O (o-o) on China-AI forecasts · 2024-02-29T04:50:34.449Z · LW · GW

Weight them by wealth too.

Comment by O O (o-o) on Can we get an AI to do our alignment homework for us? · 2024-02-27T17:05:01.191Z · LW · GW

I don’t have the specifics but this is just a natural tendency of many problems - verification is easier than coming up with the solution. Also maybe there are systems where we can require the output to be mathematically verified or reject solutions whose outcomes are hard to understand.

Comment by O O (o-o) on Can we get an AI to do our alignment homework for us? · 2024-02-26T09:07:36.637Z · LW · GW

My intuition is that it is at least feasible to align a human-level intelligence with the "obvious" methods that fail for superintelligence, and have it run faster to produce superhuman output. 

Second, it is also possible to robustly verify the outputs of a superhuman intelligence without superhuman intelligence.

And third, there is a lot of value to be captured from narrow AIs that don't have deceptive capabilities but are very good at, say, solving math. 

Comment by O O (o-o) on O O's Shortform · 2024-02-17T20:48:07.442Z · LW · GW

The response to Sora seems manufactured. Content creators are dooming about it more than they did about something like GPT-4 because it can directly affect them, and most people are dooming downstream of that.

Realistically I don’t see how it can change society much. It’s hard to control, and people will just become desensitized to deepfakes. GPT-4 and robotics transformers are obviously much more transformative for society, yet people are worrying about deepfakes (or are they really just adopting the concerns of their favorite youtuber/TV host/etc.?)

Comment by O O (o-o) on Critiques of the AI control agenda · 2024-02-15T07:13:16.206Z · LW · GW

Our LLM agents can perform complex hacks like blind SQL union attacks.

SQL union attacks are actually pretty simple and only work on poorly designed, typically old websites. Pretty much any website of the modern era sanitizes inputs or uses parameterized queries, which makes such attacks impossible.
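
As a minimal sketch of why (using Python's sqlite3 in place of a real web backend; the table and column names are made up for illustration), the vulnerable pattern concatenates user input into the SQL text, while the standard modern fix binds it as a parameter:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("CREATE TABLE products (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'hunter2')")

# A classic UNION payload: it closes the string literal, appends a UNION SELECT
# over a different table, and comments out the trailing quote.
user_input = "' UNION SELECT name, password FROM users --"

# Vulnerable pattern: user input concatenated straight into the SQL text.
vulnerable = f"SELECT id, name FROM products WHERE name = '{user_input}'"
print(conn.execute(vulnerable).fetchall())   # leaks [('alice', 'hunter2')]

# Sanitized/parameterized pattern: the driver treats the input purely as data.
safe = "SELECT id, name FROM products WHERE name = ?"
print(conn.execute(safe, (user_input,)).fetchall())  # [] -- the payload is inert
```

With the parameterized version, the UNION payload is just an ordinary string that matches nothing, which is why these attacks only bite on sites that still build queries by hand.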

I have some doubts about the complex actions bit too. My impression so far is that LLMs are still pretty bad at long-horizon tasks; they’re not reliable enough to use at all. SQL union attacks are the ones that seem to have 45 steps, so I’m guessing those steps are mostly just guessing lots of different query structures, not real planning.

Comment by O O (o-o) on O O's Shortform · 2024-02-13T09:55:52.521Z · LW · GW

The problem is AI stocks will go up a lot even if transformative AI won’t happen (and it instead just has a lot of mundane utility). You can short treasury futures relatively easily too. I imagine the people shorting these futures will have TAI priced in before it’s obvious to us through other metrics.

Comment by O O (o-o) on O O's Shortform · 2024-02-12T19:14:15.038Z · LW · GW

30y-TIPS seems like a better fit.

Comment by O O (o-o) on O O's Shortform · 2024-02-12T08:20:14.072Z · LW · GW

It’s the estimate of real economic growth. If AGI has a good chance of happening in the next 30 years and it’s priced in, that graph should go up.

Comment by O O (o-o) on O O's Shortform · 2024-02-11T08:41:29.377Z · LW · GW

https://www.cnbc.com/quotes/US30YTIP

The 30Y TIPS linked above* is probably the most reliable predictor of AI timelines. It’s essentially the market’s estimate of the real economic yield over the next 30 years.

Comment by O O (o-o) on How to (hopefully ethically) make money off of AGI · 2024-02-11T08:28:37.167Z · LW · GW

They have bought longer term treasury bonds.

Comment by O O (o-o) on OpenAI wants to raise 5-7 trillion · 2024-02-10T00:16:07.781Z · LW · GW

What are some high level changes you have in mind?

Comment by O O (o-o) on OpenAI wants to raise 5-7 trillion · 2024-02-09T20:03:19.381Z · LW · GW

If you view what they’re doing as an inevitability, I think OpenAI is one of the safest players to be starting this.

Comment by O O (o-o) on OpenAI wants to raise 5-7 trillion · 2024-02-09T19:59:00.367Z · LW · GW

Wasn't there a bet on when the first 1T training run will happen? Not saying this will be it, but I don't think it's absurd to see a very large number in the next 20 years.

Comment by O O (o-o) on OpenAI wants to raise 5-7 trillion · 2024-02-09T17:00:37.624Z · LW · GW

Yeah if your product can multiply GDP by a huge amount in the long run, basically any amount of money we can produce is an insignificant sum.

Comment by O O (o-o) on Why I think it's net harmful to do technical safety research at AGI labs · 2024-02-07T21:07:35.682Z · LW · GW

Infohazards. Researching capability risks within an AI lab can inspire researchers hearing about your findings to build new capabilities.

Does research really work like this? That is, is only one person capable of coming across an idea? It seems like any discovery usually has a lot of competitors who are fairly close. I doubt the small number of EA people choosing to or not to work in safety will have any significant impact on capabilities.

Comment by O O (o-o) on Adam Smith Meets AI Doomers · 2024-02-01T16:42:04.815Z · LW · GW

This will only work if we move past GPUs to ASICs or some other specialized hardware made for training specific AI. GPUs are too useful and widespread in everything else to be controlled that tightly. Even the China ban is being circumvented by Chinese companies using shell companies in other countries (obvious if you look at sales numbers).

Comment by O O (o-o) on AI #49: Bioweapon Testing Begins · 2024-02-01T16:25:37.116Z · LW · GW

On the Go bit: adversarial attacks exist and can even be executed by humans. There are gaps in its gameplay with regard to “cyclic” attacks.

https://arxiv.org/abs/2211.00241

https://www.lesswrong.com/posts/DCL3MmMiPsuMxP45a/even-superhuman-go-ais-have-surprising-failure-modes

Comment by O O (o-o) on What exactly did that great AI future involve again? · 2024-01-31T09:56:57.456Z · LW · GW

Do you think there’s a pathway to immortality without AGI? We still haven’t made any more progress on aging than the Romans did.

Comment by O O (o-o) on What exactly did that great AI future involve again? · 2024-01-31T09:54:43.076Z · LW · GW

Without AGI, no chance in our lifetimes or any lifetimes that are soon. Possibly never given dysgenic effects and declining world population.

Comment by O O (o-o) on AlphaGeometry: An Olympiad-level AI system for geometry · 2024-01-17T17:54:44.875Z · LW · GW

Is the IMO Grand Challenge plausibly solved, or likely to be solved soon?

Comment by O O (o-o) on O O's Shortform · 2024-01-08T17:26:41.583Z · LW · GW

Any rationalist analysis of who might win the 2024 U.S. presidential election?

Comment by O O (o-o) on Stop talking about p(doom) · 2024-01-03T20:01:48.793Z · LW · GW

Exactly, not very informative. For pretty much any p(doom), the range is anywhere from non-existent to very likely. When someone gives a Fermi estimate of a p(doom) between 0% and 100%, they may as well be saying the p(doom) is between ~0% and ~100%. Divide any number by 10 and multiply it by 10 to see this: a 10% estimate, accurate only to within an order of magnitude, really spans 1% to 100%.

Comment by O O (o-o) on Dating Roundup #2: If At First You Don’t Succeed · 2024-01-02T17:41:17.195Z · LW · GW

Actually this got me thinking: to what degree does attractiveness influence fashion and grooming choices, versus the other way around?

Comment by O O (o-o) on Stop talking about p(doom) · 2024-01-02T17:26:53.151Z · LW · GW

I’m still unconvinced that p(doom) numbers aren’t largely pulled out of thin air. Each is at best a Fermi estimate, so we can only safely assume the real p(doom) is within an order of magnitude of the stated number. That doesn’t bode well for most estimates, since it means basically any number implies a 0-100% chance of doom.

Comment by O O (o-o) on O O's Shortform · 2024-01-02T06:20:52.645Z · LW · GW

Post your forecasting wins and losses for 2023.

I’ll start:

Bad:

  • I thought the banking crisis was gonna spiral into something worse, but I had to revert that call within a few days, sadly
  • overestimated how much adding code execution to GPT would improve it
  • overconfident about LK-99 at some points (although I bet against it, it was more fun to believe in it, and my friends were betting on it)

Good:

  • tech stocks
  • government bond value reversal
  • meta stock in particular
  • Taylor Swift winning Time’s Person of the Year
  • random miscellaneous Manifold bets (don’t think too highly of these because they were safe bets that were wildly mispriced)

Comment by O O (o-o) on 2023 in AI predictions · 2024-01-02T05:03:17.311Z · LW · GW

Any new predictions for 2024?

Comment by O O (o-o) on O O's Shortform · 2023-12-20T09:27:11.499Z · LW · GW

I apply the Kelly criterion to all investments I control. It doesn’t take much for leverage to be worth it: excess returns of 7% and a standard deviation of 12% still imply greater than 1x leverage.
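
A minimal sketch of that arithmetic, assuming the standard continuous-time Kelly rule (optimal leverage equals excess return divided by return variance); the 7%/12% figures are just the ones from the comment above:

```python
def kelly_leverage(excess_return: float, volatility: float) -> float:
    """Continuous-time Kelly-optimal leverage: excess return / variance of returns."""
    return excess_return / volatility ** 2

# 7% excess return with 12% standard deviation -> about 4.9x,
# i.e. comfortably greater than 1x leverage.
print(kelly_leverage(0.07, 0.12))
```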

Comment by O O (o-o) on OpenAI, DeepMind, Anthropic, etc. should shut down. · 2023-12-19T16:52:21.565Z · LW · GW

Presumably these innovations were immediately profitable.

That’s not always the case. It can take time to scale up an innovation, but I’d assume it’s plausibly profitable. AGI is no longer a secret belief, and several venture capitalists and rich people believe in it. These people also understand long profit horizons. Uber took over 10 years to become profitable. Many startups haven’t been profitable yet.

Also a major lab shutting down for safety reasons is like broadcasting to all world governments that AGI exists and is powerful/dangerous.

Comment by O O (o-o) on OpenAI, DeepMind, Anthropic, etc. should shut down. · 2023-12-19T08:20:08.704Z · LW · GW

meaning the smattering of new orgs would still advance slower collectively, and each may have more trouble getting millions/billions of funding unless the leadership are also decent negotiators

This seems to contradict history. The breakup of Standard Oil, for example, led to innovations in oil drilling. Also, you are seriously overestimating how hard it is to get funding: much stupider and more poorly run companies have gotten billions. And in the worst case these leaders can just hire negotiators.

Comment by O O (o-o) on OpenAI, DeepMind, Anthropic, etc. should shut down. · 2023-12-19T08:11:46.074Z · LW · GW

And they would start developing web apps at Microsoft, or go to xAI, Inflection, and Chinese labs?

Am I in crazy town? Did we not see what happened when there was an attempt to slightly modify OpenAI, let alone shut it down?

Comment by O O (o-o) on O O's Shortform · 2023-12-19T06:55:32.797Z · LW · GW

I think it’s at the very least clear for the majority of investments, leverage of 1 is suboptimal even if you assume future returns are lower and volatility is higher.

Comment by O O (o-o) on O O's Shortform · 2023-12-18T09:01:36.791Z · LW · GW

Anyone Kelly betting their investments? I.e., taking the mathematically optimal amount of leverage. If you’re invested in the S&P 500, this would be about 1.4x; more or less if your portfolio has higher or lower risk-adjusted returns.

Comment by O O (o-o) on Are There Examples of Overhang for Other Technologies? · 2023-12-15T18:46:46.584Z · LW · GW

The history of (or lack of) nuclear exchanges seems relevant to discussing whether they will be effective at enforcing treaties.

Comment by O O (o-o) on Are There Examples of Overhang for Other Technologies? · 2023-12-15T18:16:51.066Z · LW · GW

Nuclear exchanges won’t end the world, but they will make the nations that start them forever irrelevant. If the top 100 major cities in the U.S. were wiped out, the U.S. would become the next Roman Empire for all intents and purposes, an echo of the past. That represents an immense loss in GDP and a complete destruction of the economy.

Also, you can’t have it both ways.

Either a nuclear exchange is deadly and a reason to abide by a treaty (and suicidal enough that leaders won’t actually do it), or it’s not, and people won’t abide by the treaty.

Comment by O O (o-o) on Are There Examples of Overhang for Other Technologies? · 2023-12-15T18:09:48.760Z · LW · GW

I’m seeing a lot of examples in this thread of arguing that past examples which seemingly apply don’t really apply because of some technicality. Of course the situation doesn’t exactly apply; we aren’t talking about bioweapons in the Soviet era here. The parallels to a hypothetical AI treaty are obvious.

A key question is has the threat of anything prompted a nuclear exchange?

The answer is no.

Has the threat of even a supposed nuclear exchange from faulty sensors prompted a nuclear exchange?

No.

Nuclear weapons are very expensive to make, very hard to develop, only good at killing, and for all practical purposes pretty useless.

We still failed to stop several rogue countries from developing them. Of course many countries didn’t build them or are downsizing their stockpiles, but is that primarily because of treaties, or because they’re very expensive and practically useless?

Try starting a nuclear exchange over China’s “cancer research” GPU clusters. Wonder how that will go.

Another key question is whether an overhang would exist. We don’t even need to compare this to jet records; we have evidence overhangs exist from the history of deep learning itself! Many hardware upgrades have led to researchers quickly beating SoTA just by trying the algorithms the new hardware lets them run. There is also a slower algorithmic overhang; just look at chess engines’ improvement rates versus compute.

Maybe I’m reading between the lines too much, but a lot of these arguments would make sense if someone first decided governance is effective then decided on their positions.

Comment by O O (o-o) on Are There Examples of Overhang for Other Technologies? · 2023-12-14T21:39:49.781Z · LW · GW

I'm pretty confused by your post.  All your examples seem like good examples of overhangs but you come to the opposite conclusion. 

Comment by O O (o-o) on Is being sexy for your homies? · 2023-12-13T23:38:58.737Z · LW · GW

Personally I just simply like the idea of being an attractive person more so than the societal implications of it. It’s just an axis that I want to do well in (in addition to money, sociability, etc). A lot of that’s reinforced by what other men think, but I also (maybe even primarily) take into account what women think.

Comment by O O (o-o) on Is Bjorn Lomborg roughly right about climate change policy? · 2023-12-12T07:38:45.272Z · LW · GW

Elaborate? Assuming there were no political blockers, why is it not cost effective? Is it because the energy output of a plant is limited by how far the energy can travel, and therefore you’d need many plants?

Comment by O O (o-o) on We're all in this together · 2023-12-07T23:37:39.383Z · LW · GW

Money and power won't matter as much, but status within your social "tribe" will be probably one of the most important things to most. For example, being good at video games, sports, getting others to follow your ideas, etc. 

Comment by O O (o-o) on Google Gemini Announced · 2023-12-07T00:22:59.365Z · LW · GW

Yes, those margins are narrow and probably gamed. GPT-4’s numbers are from the base version’s paper, and it has probably received modest capability upgrades since. Gemini also uses more advanced prompting tactics.

Comment by O O (o-o) on Google Gemini Announced · 2023-12-06T21:31:42.280Z · LW · GW

The progress in competitive programming seems to be miscalculated in a way that makes AlphaCode 2 appear better than it is. It:

  1. Samples ~1e6 solutions
  2. Of all the solutions that pass the given test cases, picks the 10 with the best "score"
  3. Submits up to 10 of the solutions until one of them passes


Steps 1 and 2 seem fine, but a human competitor in one of these contests would be penalized for step 3, which AlphaCode 2 appears not to be[1]. Further, training-set contamination, combined with the fact that these are only "easier" Div 2 questions, implies that solutions to these problems could very well appear in the training data, and the model just reconstructs them near verbatim. 
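
For concreteness, here is a rough sketch of that sampling/filtering/submission pipeline as I read it (the function and parameter names are my own placeholders, not anything from the AlphaCode 2 report):

```python
from typing import Callable, List, Optional, Tuple

def ac2_style_submission(
    sample_solution: Callable[[], str],        # draws one candidate program from the model
    passes_public_tests: Callable[[str], bool],
    score: Callable[[str], float],             # fine-tuned scoring model
    judge_accepts: Callable[[str], bool],      # hidden test cases
    num_samples: int = 1_000_000,
) -> Tuple[Optional[str], int]:
    # 1. Sample ~1e6 candidate solutions.
    candidates: List[str] = [sample_solution() for _ in range(num_samples)]

    # 2. Keep candidates that pass the given (public) tests, then take the
    #    10 with the best score from the learned scoring model.
    passing = [c for c in candidates if passes_public_tests(c)]
    top10 = sorted(passing, key=score, reverse=True)[:10]

    # 3. Submit up to 10 solutions until one passes the hidden tests.
    #    A human contestant would take a penalty for each failed submission.
    for attempts, solution in enumerate(top10, start=1):
        if judge_accepts(solution):
            return solution, attempts
    return None, len(top10)
```

The point is just that step 3 gives the system up to 10 penalty-free tries against the hidden tests, which is where the comparison to human contestants breaks down.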

In defense of AlphaCode 2, the fine-tuned scoring model that picks the 10 best might be a non-trivial creation. It also seems AC2 is more sample-efficient than AC1, so it is getting better at generating solutions. Assuming nothing funky is happening with the training set, at the limit this means 1 correct solution per sample. 

  1. ^

    Could be wrong, but if I am the paper should have made it more explicit

Comment by O O (o-o) on Google Gemini Announced · 2023-12-06T20:11:32.789Z · LW · GW

From a short read, capabilities seem equal to GPT-4. AlphaCode 2 is also not penalized for its first 9 submissions, so I struggle to see how it can be compared to humans.

Comment by O O (o-o) on Based Beff Jezos and the Accelerationists · 2023-12-06T18:26:59.613Z · LW · GW

I can see that tweet implying some trans-humanist position, not necessarily extinction. I think he is about to have a debate with Connor Leahy so it will all be cleared up.

Comment by O O (o-o) on Some quick thoughts on "AI is easy to control" · 2023-12-06T03:30:43.919Z · LW · GW

I consider this system to be superhuman, and the problem of aligning it to be "alignment-complete" in the sense that if you solve any of the problems in this class, you essentially solve alignment down the line and probably avoid x-risk,

I find this line of reasoning (and even mentioning it) not useful. Any alignment solution will be alignment complete so it’s tautological.

I think you’ve defined alignment as a hard problem, which no one will disagree with, but you also define any step taken towards solving the alignment problem as alignment-complete, and thus impossible unless it also seems infeasibly hard. Can there not be an iterative way to solve alignment? I think we can construct some trivial hypotheticals where we iteratively solve it.

For the sake of argument, say I created a superhuman math theorem prover, something that can solve IMO problems written in Lean with ease. I then use it to solve a lot of important math problems within alignment. This in turn affords us strong guarantees about certain elements of alignment or gradient descent. Can you convince me that getting a narrow AI useful for alignment is as hard as aligning a generally superhuman AI?

What if we reframe it with a real-world example? A proof of the Riemann hypothesis would begin with a handful of difficult but comparatively simple lemmas. Proving those lemmas is not as hard as proving the Riemann hypothesis itself. And we can keep decomposing the proof into parts that are simpler than the whole.

A step in a process being simpler than the end result of the process is not an argument against that step.

Comment by O O (o-o) on Thoughts on “AI is easy to control” by Pope & Belrose · 2023-12-04T05:43:01.784Z · LW · GW

If this is your definition of a dystopia we already live in a dystopia. You can’t make nuclear bombs without being picked up by the FBI/CIA and you’ll probably be arrested in the process. Making something illegal doesn’t define an authoritarian regime. Governments already try to stop international players from building nukes. It just lacks teeth because you can ultimately live with sanctions.

The other problem is it’s way too easy to avoid surveillance or defect in a human regime. For example, you can run a decentralized training network and claim you are training good AI. It’s also unusually easy to regulate AI training right now: GPUs are easy to control because only Nvidia can make them, but this won’t always be true. It’s also much easier to hide GPU production than nuke production, because we need GPUs and CPUs for a ton of other useful things.

Theoretically, an ASI could probably correctly infer attempts to use compute from your internet signals. Further, if you have the benefits you want from an ASI, you have much less reason to build a second one that’s possibly unaligned. “The digital god says you can’t build it” probably sounds a lot more compelling than “Joe Biden says you can’t build it”.

Comment by O O (o-o) on Thoughts on “AI is easy to control” by Pope & Belrose · 2023-12-03T18:36:07.915Z · LW · GW

The argument is probably something like: superintelligence allows robust monitoring of other people’s attempts at building superintelligences, and could likely resolve whatever prisoner’s dilemmas push people to build them. We don’t need an authoritarian regime.

I expect an ASI can denoise easily available data enough to confidently figure out who is trying to build a superintelligence, and either stop them through soft power/argumentation or implore them to make it aligned.

I think most problems are probably easy to solve with a truly aligned superintelligence. It may come up with good solutions we haven’t even thought of.

Comment by O O (o-o) on Quick takes on "AI is easy to control" · 2023-12-03T09:47:57.812Z · LW · GW

Testing it on out-of-distribution examples seems helpful. If an AI still acts as if it follows human values out of distribution, it probably truly cares about human values. For AI with situational awareness, we can probably run simulations to an extent (and probably need to bootstrap this after a certain capabilities threshold).

Comment by O O (o-o) on Quick takes on "AI is easy to control" · 2023-12-03T09:38:38.328Z · LW · GW

I like “deprecate”

Comment by O O (o-o) on Is OpenAI losing money on each request? · 2023-12-02T00:58:53.185Z · LW · GW

News is the profit cap grows 20% a year, so it will really last until AGI.

Comment by O O (o-o) on Is OpenAI losing money on each request? · 2023-12-01T05:51:31.462Z · LW · GW

With basically a blank check from VC, they’ll instead invest in making their models and infra more efficient/better instead of raising prices. They can run a large loss for a very long time.