Andrew Burns's Shortform

post by Andrew Burns (andrew-burns) · 2024-02-10T23:57:39.684Z · LW · GW · 14 comments


comment by Andrew Burns (andrew-burns) · 2024-04-27T01:54:05.536Z · LW(p) · GW(p)

So the usual refrain from Zvi and others is that the specter of China beating us to the punch with AGI is not real because of limits on compute, etc. I think Zvi has tempered his position on this in light of Meta's promise to release the weights of its 400B+ model. Now there is word that SenseTime just released a model that beats GPT-4 Turbo on various metrics. Of course, maybe Meta chooses not to release its big model, and maybe SenseTime is bluffing--I would point out, though, that Alibaba's Qwen model seems to do pretty okay in the arena. Anyway, my point is that I don't think the "what if China" argument can be dismissed as quickly as some people on here seem ready to do.

Replies from: Seth Herd, ChristianKl
comment by Seth Herd · 2024-04-27T14:18:15.790Z · LW(p) · GW(p)

Are you saying that China will use Llama 3 400B weights as a basis for improving their research on LLMs? Or to make more tools from? Or to reach real AGI? Or what?

Replies from: andrew-burns
comment by Andrew Burns (andrew-burns) · 2024-04-27T15:26:10.954Z · LW(p) · GW(p)

Yes, yes. Probably not. And they already have a Sora clone called Vidu, for heaven's sake.

We spend all this time debating: should greedy companies be in control, should government intervene, will intervention slow progress to the good stuff: cancer cures, longevity, etc. All of these arguments assume that WE (which I read as a gloss for the West) will have some say in the use of AGI. If the PRC gets it, and it is as powerful as predicted, these arguments become academic. And this is not because the Chinese are malevolent. It's because AGI would fall into the hands of the CCP via their civil-military fusion. This is a far more calculating group than those in Western governments. Here, officials have to worry about getting through the next election. There, they can more comfortably wield AGI for their ends while worrying less about the palatability of the means: observe how the population quietly endured a draconian lockdown and only meekly revolted when conditions began to deteriorate and containment looked futile.

I am not an accelerationist. But I am a get-it-before-them-ist. Whether the West (which I count as including Korea and Japan and Taiwan) can maintain our edge is an open question. A country that churns out PhDs and loves AI will not be easily thwarted.

Replies from: gwern, niplav
comment by gwern · 2024-04-30T00:35:49.958Z · LW(p) · GW(p)

And they already have a Sora clone called Vidu, for heaven's sake.

No, they don't. They have a video generation model, which is one of a great many published over the past few years as image generation increasingly became solved, such as Imagen Video or Phenaki from Google years ago, and the Vidu samples are clearly inferior to Sora (despite heavy emphasis on the 'pan over static scene' easy niche): https://www.youtube.com/watch?v=u1R-jxDPC70

Here we are in 2024, and we're still being told how Real Soon Now Chinese DL will crush Westerners. I've been hearing this for almost a decade now, and I've stopped being impressed by the likes of Hsu talking about how "China graduates a million engineers a year!" or whatever. Somehow, the Next Big Thing never comes out of Chinese DL, no matter how many papers or citations or patents they have each year. Something to think about.

(I also have an ongoing Twitter series where every half year or so, I tweet a few of the frontier-pushing Western DL achievements, and I ask for merely 3 Chinese things as good - not better, just plausibly as good, including in retrospect from previous years. You know how many actual legitimate answers I've gotten? Like 1. Somehow, all the e/accs and China hawks like Alexandr Wang can't seem to think of even a single one which was at or past the frontier, as opposed to the latest shiny 'catches up to GPT-4!* * [on narrow benchmarks, YMMV]' clone model.)

comment by niplav · 2024-04-27T21:29:31.617Z · LW(p) · GW(p)

The standard way of dealing with this:

Quantify how much worse the PRC getting AGI would be than OpenAI getting it, or the US government, and how much existential risk there is from not pausing/pausing, or from the PRC/OpenAI/the US government building AGI first, and then calculating whether pausing to do {alignment research, diplomacy, sabotage, espionage} is higher expected value than moving ahead.

(Is China getting AGI first half the value of the US getting it first, or 10%, or 90%?)

The discussion over pause or competition around AGI has been lacking this so far. Maybe I should write such an analysis.

Gentlemen, calculemus!
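A minimal sketch of the kind of calculation proposed above, in Python. All probabilities and relative values here are illustrative placeholders (not anyone's actual estimates), and the model is deliberately crude: it normalizes "US-led aligned AGI" to value 1.0 and ignores timelines, multipolar outcomes, and everything else a real analysis would need.

```python
# Toy expected-value comparison of "race ahead" vs. "pause for alignment
# research", in the spirit of the comment above. All inputs are placeholders.

def expected_value(p_us_first, p_doom_us, p_doom_prc, value_prc_relative):
    """EV of a scenario, normalizing US-led aligned AGI to value 1.0.

    p_us_first:         chance the US builds AGI before the PRC
    p_doom_us/_prc:     chance of existential catastrophe given each builder
    value_prc_relative: value of PRC-led AGI relative to US-led (0.1? 0.5? 0.9?)
    """
    ev_us = p_us_first * (1 - p_doom_us) * 1.0
    ev_prc = (1 - p_us_first) * (1 - p_doom_prc) * value_prc_relative
    return ev_us + ev_prc

# Racing: the US is likely first, but doom risk is higher from cut corners.
ev_race = expected_value(p_us_first=0.8, p_doom_us=0.3,
                         p_doom_prc=0.4, value_prc_relative=0.5)

# Pausing: doom risk drops, but the PRC is more likely to catch up.
ev_pause = expected_value(p_us_first=0.5, p_doom_us=0.1,
                          p_doom_prc=0.4, value_prc_relative=0.5)

print(f"EV(race)  = {ev_race:.2f}")   # 0.62 with these placeholder numbers
print(f"EV(pause) = {ev_pause:.2f}")  # 0.60 with these placeholder numbers
```

The point of writing it down is that the conclusion flips depending on the inputs: small changes to `value_prc_relative` or the doom probabilities swing which option wins, which is exactly why the numbers need to be argued for explicitly rather than left implicit.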

comment by ChristianKl · 2024-04-30T11:57:36.251Z · LW(p) · GW(p)

Isn't the main argument that Zvi makes that China is willing to do AI regulation, and thus we can also do AI regulation?

In that frame, the fact that Meta releases its weights is just a regulatory failure on our part.

comment by Andrew Burns (andrew-burns) · 2024-02-10T23:57:39.779Z · LW(p) · GW(p)

When I was in middle school, our instructor was trying to teach us about the Bill of Rights. She handed out a paper copy and I immediately identified that Article the first (sic) and Article the second (sic) were not among the first ten amendments and that the numbers for the others were wrong. I boldly asserted that this wasn't the Bill of Rights and the teacher apologized and cursed the unreliable Internet. But I was wrong. This WAS the Bill of Rights, but the BILL rather than the ten ratified amendments. Everyone came away wrongly informed from that exchange.

Edit: I wrote before that I identified that they were not in the Constitution, but article the second is, as the 27th amendment, and I knew that, but it wasn't among the first ten.

comment by Andrew Burns (andrew-burns) · 2024-04-29T21:07:21.530Z · LW(p) · GW(p)

Anyone paying attention to the mystery of the GPT-2 chatbot that has appeared on lmsys? People are saying it operates at levels comparable to or exceeding GPT-4. I'm writing because the appearance of mysterious, unannounced chatbots for public use, without provenance, makes me update my p(doom) upward.

Possibilities:

  1. this is an OpenAI chatbot based on GPT-4, just like it says it is. It has undergone some more tuning and maybe has boosted reasoning because of methods described in one of the more recently published papers

  2. this is another big American AI company masquerading as OpenAI

  3. this is a big Chinese AI company masquerading as OpenAI

  4. this is an anonymous person or group using some GPT-4 fine-tune API to improve performance

Possibility 1 seems most likely. If that is the case, I guess it is alright, assuming it is purely based on GPT-4 and isn't a new model. I suppose if they wanted to test on lmsys to gauge performance anonymously, they couldn't slap 4.5 on it, but they also couldn't ethically give it the name of another company's model. Giving it an entirely new name would invite heavy suspicion. So calling it the name of an old model and monitoring how it does in battle seems like the most ethical compromise. Still, even labeling a model with a different name feels deceptive.

Possibility 2 would be extremely unethical and I don't think it is the case. Also, the behavior of the model looks more like GPT-4 than another model. I expect lawsuits if this is the case.

Possibility 3 would be extremely unethical, but is possible. Maybe they trained a model on many GPT-4 responses and then did some other stuff. Stealing a model in this way would probably accelerate KYC legislation and yield outright bans on Chinese rental of compute. If this is the case, then there is no moat because we let our moat get stolen.

Possibility 4 is something someone mentioned on Twitter. I don't know whether it is viable.

In any case, releasing models in disguise onto the Internet lowers my expectations for companies to behave responsibly and transparently. It feels a bit like Amazon's scheme to collect logistics data from competitors by operating under a different name. In that case, as in this one, the facade was paper thin...the headquarters of the fake company was right next to Amazon, but it worked for a long while. Since I think 1 is the most likely, I believe OpenAI wants to make sure it soundly beats everyone else in the rankings before releasing an update with improvements. But didn't they just release an update a few weeks ago? Hmm.

Replies from: whitehatStoic
comment by MiguelDev (whitehatStoic) · 2024-04-29T22:37:44.026Z · LW(p) · GW(p)

I'm not entirely sure if it's the same gpt2 model I've been experimenting with over the past year. If I get my hands on it, I will surely try to stretch its context window and see if it exceeds 1024 tokens, to test whether it's really GPT-2.

Replies from: gwern
comment by gwern · 2024-04-30T00:41:05.570Z · LW(p) · GW(p)

It definitely exceeds 1024 BPEs context (we wouldn't be discussing it if it didn't, I don't think people even know how to write prompts that, combined with the system prompt etc, even fit in 1024 BPEs anymore), and it is almost certainly not GPT-2, come on.

Replies from: whitehatStoic
comment by MiguelDev (whitehatStoic) · 2024-04-30T02:27:56.097Z · LW(p) · GW(p)

Copying and pasting an entire paper/blog and asking the model to summarize it? This isn't hard to do, and it's very easy to know if there are enough tokens: just run the text through any BPE tokenizer available online.
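The check described above is easy to sketch. A real test would run the text through an actual BPE tokenizer (e.g. OpenAI's tiktoken library with its `gpt2` encoding); the stdlib-only version below instead uses the common rule of thumb of roughly 4 characters per token for English text, which is plenty to tell whether a pasted paper overflows GPT-2's 1024-token window.

```python
# Rough check of whether a prompt would overflow GPT-2's context window.
# Uses the ~4 characters-per-token heuristic as a stand-in for a real
# BPE tokenizer such as tiktoken's "gpt2" encoding.

GPT2_CONTEXT = 1024   # GPT-2's maximum context length, in BPE tokens
CHARS_PER_TOKEN = 4   # rough average for English text

def estimated_tokens(text: str) -> int:
    """Crude token-count estimate from character length."""
    return len(text) // CHARS_PER_TOKEN

def likely_exceeds_gpt2_context(text: str) -> bool:
    """True if the text almost certainly cannot fit in GPT-2's window."""
    return estimated_tokens(text) > GPT2_CONTEXT

# A pasted paper of ~20,000 characters (~5,000 estimated tokens)
# blows far past the 1024-token limit.
paper = "word " * 4000
print(likely_exceeds_gpt2_context(paper))  # prints: True
```

Any model that summarizes such a text in one go therefore cannot be the original GPT-2, which is the point of the test.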

Replies from: gwern
comment by gwern · 2024-04-30T13:48:31.904Z · LW(p) · GW(p)

Sure, the poem prompt I mentioned using is like 3500 characters all on its own, and it had no issues repeatedly revising and printing out 4 new iterations of the poem without apparently forgetting when I used up my quota yesterday, so that convo must've been several thousand BPEs.

Replies from: whitehatStoic
comment by MiguelDev (whitehatStoic) · 2024-04-30T13:51:00.127Z · LW(p) · GW(p)

Yeah, I saw your other replies in another thread, and I was able to test it myself later today. Yup, it's most likely OpenAI's new LLM. I'm just still confused about why they'd call it gpt2.

Replies from: gwern
comment by gwern · 2024-04-30T19:41:28.056Z · LW(p) · GW(p)

Altman made a Twitter-edit joke about 'gpt-2 i mean gpt2', so at this point, I think it's just a funny troll-name related to the 'v2 personality' which makes it a successor to the ChatGPT 'v1', presumably, 'personality'. See, it's gptv2 geddit not gpt-2? very funny, everyone lol at troll