Posts

National Telecommunications and Information Administration: AI Accountability Policy Request for Comment 2023-04-11T22:59:22.644Z
Cyberspace Administration of China: Draft of "Regulation for Generative Artificial Intelligence Services" is open for comments 2023-04-11T09:32:13.396Z
Large language models aren't trained enough 2023-03-29T00:56:13.925Z
Alpaca: A Strong Open-Source Instruction-Following Model 2023-03-14T02:41:23.384Z
Adversarial Policies Beat Professional-Level Go AIs 2022-11-03T13:27:00.059Z
DeepMind on Stratego, an imperfect information game 2022-10-24T05:57:39.462Z
Russia will do a nuclear test 2022-10-04T14:59:20.288Z
Double Asteroid Redirection Test succeeds 2022-09-27T06:37:17.816Z
Will Russia default, and if so, will it lead to financial crisis? 2022-03-09T06:34:00.567Z
Sphere packing and logical uncertainty 2016-04-25T06:02:57.685Z
Negative karma is a bad design 2012-12-13T11:27:35.739Z

Comments

Comment by sanxiyn on Elon files grave charges against OpenAI · 2024-03-02T04:47:03.835Z · LW · GW

No. Traditionally, donors have no standing to sue a charity. From https://www.thetaxadviser.com/issues/2021/sep/donor-no-standing-sue-donor-advised-fund.html

California limits by statute the persons who can sue for mismanagement of a charitable corporation's assets. The court found that the claims raised by Pinkert for breach of a fiduciary duty for mismanagement of assets were claims for breach of a charitable trust. The court determined that under California law, a suit for breach of a charitable trust can be brought by the attorney general of California...

Comment by sanxiyn on The First Room-Temperature Ambient-Pressure Superconductor · 2023-07-27T09:01:36.077Z · LW · GW

The patent is not yet granted.

Comment by sanxiyn on The First Room-Temperature Ambient-Pressure Superconductor · 2023-07-27T08:58:40.481Z · LW · GW

Someone from South Korea is extremely skeptical and wrote a long thread going into the paper's details on why it must be 100% false: https://twitter.com/AK2MARU/status/1684435312557314048. Sorry it's in Korean, but we live in the age of miracles and serviceable machine translation.

Comment by sanxiyn on A brief history of computers · 2023-07-22T22:00:26.911Z · LW · GW

But it wasn't until the 1940s and the advent of the electronic computer that they actually built a machine that was used to construct mathematical tables. I'm confused...

You are confused because that is not the reality. As you can read in Wikipedia's entry on the difference engine, Scheutz built a difference engine derivative, sold it, and it was used to create logarithmic tables.

You must have read this while writing the article. It is prominent in the Wikipedia article in question and hard to miss. Why did you make this mistake? If it was a deliberate attempt to mislead for narrative convenience, I am very disappointed. Yes, reality is rarely narratively convenient, but you shouldn't lie about it.

Comment by sanxiyn on [Linkpost] Introducing Superalignment · 2023-07-06T11:34:51.752Z · LW · GW

My median estimate has been 2028 (so 5 years from now). I first wrote down 2028 in 2016 (12 years out at the time), and in the 7 years since, I have barely moved the estimate. Things roughly happened when I expected them to.

Comment by sanxiyn on OpenAI introduces function calling for GPT-4 · 2023-06-20T03:17:54.279Z · LW · GW

I am curious how this fine-tuning for function calling was done, because it is user controllable. In the OpenAI API, if you pass none to the function_call parameter, the model never calls a function. There seems to be one input bit and one output bit, for "you may want to call a function" and "I want to call a function".
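
A minimal sketch of what I mean, using the pre-1.0 openai Python client as I understand it (get_weather is a made-up example function, not anything OpenAI ships):

```python
import openai  # openai-python, pre-1.0 interface

functions = [{
    "name": "get_weather",  # hypothetical function, purely for illustration
    "description": "Get the current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",
    messages=[{"role": "user", "content": "What's the weather in Seoul?"}],
    functions=functions,
    function_call="none",  # the input bit: "auto" lets the model decide, "none" forbids calling
)

message = response["choices"][0]["message"]
wants_call = message.get("function_call") is not None  # the output bit: set only when the model wants to call
```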

Comment by sanxiyn on UK PM: $125M for AI safety · 2023-06-13T00:08:14.683Z · LW · GW

While I agree that being led by someone who is aware of AI safety is a positive sign, I note that OpenAI is led by Sam Altman, who has similarly shown awareness of AI safety issues.

Comment by sanxiyn on [deleted post] 2023-06-11T03:56:37.776Z

I did the obvious thing and it worked? I have a suspicion you haven't tried hard enough, but indeed we all have comparative advantages.

Comment by sanxiyn on InternLM - China's Best (Unverified) · 2023-06-10T01:05:18.004Z · LW · GW

The parallelization part (data parallelism, tensor parallelism, pipeline parallelism, ZeRO) is completely standard. See Efficient Training on Multiple GPUs by Hugging Face for a standard description. The failure recovery part is relatively unusual.

Comment by sanxiyn on LEAst-squares Concept Erasure (LEACE) · 2023-06-08T07:56:53.996Z · LW · GW

That is trivial to program? For example, you can have an AutoGPT UI which lists pending tasks with icons next to them, where clicking a trashcan completely erases that task from the context. That doesn't need any LLM-level help like LEACE.
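
A toy sketch of what I have in mind (my own invented example, not actual AutoGPT code): the pending tasks live in an ordinary list, the prompt is rebuilt from that list before every call, so deleting a list element is all the "erasure" that is needed:

```python
# Pending tasks as plain data; the LLM only ever sees what we put in the prompt.
pending_tasks = [
    "1. Search the web for competitors",
    "2. Summarize findings",
    "3. Draft an email to the team",
]

def erase_task(index: int) -> None:
    """The trashcan click: the task never appears in any future prompt."""
    del pending_tasks[index]

def build_prompt(goal: str) -> str:
    """Rebuild the context from scratch before each LLM call."""
    return f"Goal: {goal}\nPending tasks:\n" + "\n".join(pending_tasks)

erase_task(1)  # user clicked the trashcan next to task 2
print(build_prompt("Market research"))
```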

Comment by sanxiyn on LEAst-squares Concept Erasure (LEACE) · 2023-06-08T07:44:43.772Z · LW · GW

What do you mean? Current LLMs are stateless. If unsuccessful attempts to solve the task are made, just reset the history and retry.

Comment by sanxiyn on One implementation of regulatory GPU restrictions · 2023-06-05T01:49:04.603Z · LW · GW

There is no problem with an air gap. Public key cryptography is a wonderful thing. Let there be a license file, which is a signed statement of the hardware ID and the duration for which the license is valid. You need the private key to produce a license file, but the public key can be used to verify it. Publish a license server which can verify license files and can be run inside air-gapped networks. Done.
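
A minimal sketch of the scheme, using Ed25519 from the Python cryptography package (the field names and file format here are my own invention):

```python
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

# Vendor side (outside the air gap): sign a statement of hardware ID and validity period.
private_key = Ed25519PrivateKey.generate()
statement = json.dumps({"hardware_id": "GPU-1234-ABCD", "valid_until": "2024-12-31"}).encode()
license_file = {
    "statement": statement.decode(),
    "signature": private_key.sign(statement).hex(),
}

# License server side (inside the air-gapped network): only the public key is needed.
public_key = private_key.public_key()
try:
    public_key.verify(bytes.fromhex(license_file["signature"]),
                      license_file["statement"].encode())
    print("license valid:", json.loads(license_file["statement"]))
except InvalidSignature:
    print("license rejected")
```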

Comment by sanxiyn on The AGI Race Between the US and China Doesn’t Exist. · 2023-06-05T01:31:47.676Z · LW · GW

I note that this is how Falcon from Abu Dhabi was trained. To quote:

Falcon is a 40 billion parameters autoregressive decoder-only model trained on 1 trillion tokens. It was trained on 384 GPUs on AWS over the course of two months.

Comment by sanxiyn on Humans, chimpanzees and other animals · 2023-06-01T01:55:12.220Z · LW · GW

I think a bow and arrow is powerful enough and a gun is not necessary.

Comment by sanxiyn on What's the consensus on porn? · 2023-05-31T10:33:22.458Z · LW · GW

As an example of a question specific enough to be answerable by science, there is Is Pornography Use Associated with Sexual Difficulties and Dysfunctions among Younger Heterosexual Men? (2015). It begins:

Recent epidemiological studies reported high prevalence rates of erectile dysfunction (ED) among younger heterosexual men (≤40). It has been suggested that this "epidemic" of ED is related to increased pornography use. However, empirical evidence for such association is currently lacking.

The answer is no. As far as I know, this was among the first studies powerful enough to answer this question. Well done, science!

Of course, nobody listens to science. Compare the introduction above with another introduction written 4 years later, from Is Pornography Use Related to Erectile Functioning? (2019).

Despite evidence to the contrary, a number of advocacy and self-help groups persist in claiming that internet pornography use is driving an epidemic of erectile dysfunction (ED).

The shift in tone is palpable, and you can just feel the researchers' powerlessness about the situation.

Comment by sanxiyn on The way AGI wins could look very stupid · 2023-05-13T15:55:33.261Z · LW · GW

Since the topic of chess was brought up: I think the right intuition pump is the endgame tablebase, not moves played by AlphaZero. A quote from Wikipedia about a KRNKNN mate-in-262 discovered by an endgame tablebase:

Playing over these moves is an eerie experience. They are not human; a grandmaster does not understand them any better than someone who has learned chess yesterday. The knights jump, the kings orbit, the sun goes down, and every move is the truth. It's like being revealed the Meaning of Life, but it's in Estonian.

Comment by sanxiyn on I bet $500 on AI winning the IMO gold medal by 2026 · 2023-05-13T00:07:28.043Z · LW · GW

I agree timescale is a good way to think about this. My intuition is that if high school math problems are 1, then IMO math problems are 100 (1e2) and typical research math problems are 10,000 (1e4). So exactly halfway! I don't have first-hand experience with the hardest research math problems, but from what I have heard about their timescale they seem to reach 1,000,000 (1e6). I'd rate typical practical R&D problems 1e3 and transformative R&D problems 1e5.

Edit: Using this scale, I rate GPT-3 at 1 and GPT-4 at 10. This suggests GPT-5 for IMO, which feels uncomfortable to me! Thinking about it, while there is a lot of 1-data and 10-data, there is considerably less 100-data, and above that most things are not written down. But maybe that is an excuse and it doesn't matter.

Comment by sanxiyn on I bet $500 on AI winning the IMO gold medal by 2026 · 2023-05-12T02:37:09.952Z · LW · GW

I kind of disagree. (I was on the South Korean IMO team.) I agree IMO problems are in a category of tasks closer to research math than to high school math, but since IMO problems are intended to be solvable within a time limit, there is a (quite low, in an absolute sense) upper limit to their difficulty. Basically, the intended solution is no longer than a single page. Research math problems have no such limit and can be arbitrarily difficult, or have an arbitrarily long solution.

Edit: Apart from the time limit, length limit, and difficulty limit, another important aspect is that IMO problems are already solved, so they are known to be solvable. IMO problems are "Prove X". Research math problems, even if they are stated as "Prove X", are really "Prove or disprove X", and sometimes this matters.

Comment by sanxiyn on I bet $500 on AI winning the IMO gold medal by 2026 · 2023-05-12T02:28:43.847Z · LW · GW

Eh, there are not that many IMO problems, even including shortlisted problems. Since there are not that many, IMO contestants basically solve all previous IMO problems to practice. So it's not as if AI has an unfair advantage.

I am of the opinion that adding the condition of "not trained on prior IMO/math contest problems" is ridiculous.

Comment by sanxiyn on New OpenAI Paper - Language models can explain neurons in language models · 2023-05-11T06:50:04.813Z · LW · GW

GPT-6 will probably be able to analyze all the neurons in itself with >0.5 scores

This seems to assume the task (writing explanations for all neurons with >0.5 scores) is possible at all, which is doubtful. Superposition and polysemanticity are certainly things that actually happen.

Comment by sanxiyn on Geoff Hinton Quits Google · 2023-05-04T01:47:22.666Z · LW · GW

I note that Eliezer did this (pretty much immediately) on Twitter.

Comment by sanxiyn on The Rocket Alignment Problem, Part 2 · 2023-05-02T05:24:17.400Z · LW · GW

Of course such approaches have been suggested, for example LOVE in a simbox is all you need. The main argument has been over whether the simulation can be realistic, and whether it can be secure.

Comment by sanxiyn on AI doom from an LLM-plateau-ist perspective · 2023-04-28T01:08:23.409Z · LW · GW

I would not describe the development of deep learning as discontinuous, but I would describe it as fast. As far as I can tell, the development of deep learning happened by the accumulation of many small improvements over time, sometimes humorously described as graduate student descent (better initialization, better activation functions, better optimizers, better architectures, better regularization, etc.). It seems possible or even probable that brain-inspired RL could follow a similar trajectory once it took off, absent interventions like changes to open publishing norms.

Comment by sanxiyn on AI chatbots don't know why they did it · 2023-04-27T08:39:32.846Z · LW · GW

I think the primary difficulty is how to train it. GPT is trained on internet text, but the internet does not record the memories of the authors of that text, so memory is unavailable in the training data set.

Comment by sanxiyn on Sama Says the Age of Giant AI Models is Already Over · 2023-04-17T21:28:54.281Z · LW · GW

As far as I can tell, Sam is saying no to size. That does not mean saying no to compute, data, or scaling.

"Hundreds of complicated things" comment definitely can't be interpreted to be against transformers, since "simply" scaling transformers fits the description perfectly. "Simply" scaling transformers involves things like writing a new compiler. It is simple in strategy, not in execution.

Comment by sanxiyn on GPTs are Predictors, not Imitators · 2023-04-10T11:15:45.184Z · LW · GW

Ask GPT to hash you a word (let alone guess which word was the origin of a hash), it'll just put together some hash-like string of tokens. It's got the right length and the right character set (namely, it's a hex number), but otherwise, it's nonsense.

But GPT can do base64 encoding. So what is the difference?
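
For concreteness, the two operations side by side (Python standard library); base64 maps each 3-byte chunk to 4 characters independently and is exactly reversible, while a cryptographic hash mixes every input bit into every output bit:

```python
import base64
import hashlib

word = b"alignment"
print(base64.b64encode(word).decode())   # 'YWxpZ25tZW50' -- computed chunk by chunk
print(base64.b64decode(b"YWxpZ25tZW50")) # decoding recovers the word exactly
print(hashlib.sha256(word).hexdigest())  # 64 hex chars with no visible relation to the input
```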

Comment by sanxiyn on No Summer Harvest: Why AI Development Won't Pause · 2023-04-10T02:05:32.559Z · LW · GW

I am also not sure it is enough to change the conclusion, but I am pretty sure "put ChatGPT into Bing" doesn't work as a business strategy due to inference cost. You seem to think otherwise, so I am interested in a discussion.

Inference cost is secret. The primary sources are the OpenAI pricing table (ChatGPT 3.5 is 0.2 cents per 1000 tokens, GPT-4 is 30x more expensive, GPT-4 with long context is 60x more expensive), a Twitter conversation between Elon Musk and Sam Altman on cost ("single-digit cents per chat" as of December 2022), and OpenAI's claim of a 90% cost reduction since December. From this I conclude OpenAI is selling API calls at cost or at a loss, almost certainly not at a profit.

Dylan Patel's SemiAnalysis is a well respected publication on business analysis of the semiconductor industry. In The Inference Cost Of Search Disruption, he estimates the cost per query at 0.36 cents. He also wrote a sequel on the cost structure of the search business, which I recommend. Dylan also points out that simply serving ChatGPT for every Google query would require $100B in capital investment, which clearly dominates other expenditures. I think Dylan is broadly right, and if you think he is wrong, I am interested in hearing where.
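
A quick back-of-envelope with these figures (the per-query cost is Dylan's estimate; the daily query volume is my own assumption, not from his article):

```python
cost_per_query = 0.0036   # USD, SemiAnalysis estimate (0.36 cents per query)
queries_per_day = 8.5e9   # assumed Google-scale search volume

daily_cost = cost_per_query * queries_per_day  # ~US$31M per day
annual_cost = daily_cost * 365                 # ~US$11B per year in operating cost,
                                               # before the ~$100B GPU capex he estimates
print(f"${daily_cost / 1e6:.0f}M/day, ${annual_cost / 1e9:.1f}B/year")
```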

Comment by sanxiyn on No Summer Harvest: Why AI Development Won't Pause · 2023-04-06T05:21:23.619Z · LW · GW

The economic cost-benefit analysis of training a SOTA model seems entirely wrong to me.

If training a new SOTA model enabled a company to gain just a fraction of the global search market

This is admirably concrete, so I will use it, but the point generalizes. This assumes Microsoft can gain (and keep) 1% of the global search market by spending $1B USD and training a new SOTA model, which is obviously false? After training a new SOTA model, they need to deploy it for inference, and inference cost dominates training cost. The analysis seems to assume inference cost is negligible, but that's just not true for the search engine case, which requires wide deployment. The analysis should either give an example of economic gain that does not require wide deployment (things like stock picking come to mind), or should analyze inference cost; at the very least it should not assume inference cost is approximately zero.

Comment by sanxiyn on Nobody’s on the ball on AGI alignment · 2023-03-31T06:23:05.593Z · LW · GW

I am interested in examples of non-empirical (theoretically based) deep learning progress.

Comment by sanxiyn on Pausing AI Developments Isn't Enough. We Need to Shut it All Down by Eliezer Yudkowsky · 2023-03-30T23:23:06.680Z · LW · GW

The recent Adversarial Policies Beat Superhuman Go AIs seems to cast doubt on how well abstractions generalize in the case of Go.

Comment by sanxiyn on Pausing AI Developments Isn't Enough. We Need to Shut it All Down by Eliezer Yudkowsky · 2023-03-30T06:13:42.297Z · LW · GW

Eh, I agree it is not mathematically possible to break a one-time pad (but it is important to remember the NSA broke VENONA; mathematical cryptosystems are not the same as their implementations in reality), but most of our cryptographic proofs are conditional and rely on assumptions. For example, I don't see what is mathematically impossible about breaking AES.

Comment by sanxiyn on FLI open letter: Pause giant AI experiments · 2023-03-29T23:06:16.051Z · LW · GW

To state the obvious, a pause narrows the lead over less ethical competitors only if the pause is not enforced against those competitors. I don't think anyone is in favor of an unenforced pause: that would indeed be stupid, as basic game theory says.

My impression is that we disagree on how feasible it is to enforce the pause. In my opinion, at the moment it is pretty feasible, because there simply are not that many competitors. Doing a large LLM training run is a rare capability now. Things are fragile, and I am in fact unsure whether it would still be feasible next year.

Comment by sanxiyn on FLI open letter: Pause giant AI experiments · 2023-03-29T22:23:45.926Z · LW · GW

I am saying it is in the Chinese government's interest for Chinese labs to slow down, as well as other labs. I am curious which part you disagree with:

a) The Chinese government prioritizes social stability over technological development (my assessment: virtually certain)
b) The Chinese government is concerned that technology like ChatGPT is a threat to social stability (my assessment: very likely, and they are in fact correct about this)
c) The Chinese government will need some time to prepare to neutralize technology like ChatGPT as a threat to social stability, as they neutralized the Internet with the Great Firewall (my assessment: very likely; they were surprised by the pace of development like everyone else)

Comment by sanxiyn on FLI open letter: Pause giant AI experiments · 2023-03-29T06:56:12.320Z · LW · GW

China also seems to be quite far behind the west in terms of LLM

This doesn't match my impression. For example, THUDM (the Tsinghua University Data Mining lab) is one of the most impressive groups in the world in terms of actually doing large LLM training runs.

Comment by sanxiyn on FLI open letter: Pause giant AI experiments · 2023-03-29T06:23:27.208Z · LW · GW

Why do you think China will ignore it? This is "it's going too fast, we need some time", and China also needs some time, for all the same reasons. For example, China is censoring Google with the Great Firewall, so if Google is to be replaced by ChatGPT, they need time to prepare to censor ChatGPT. The Great Firewall wasn't built in a day. See Father of China's Great Firewall raises concerns about ChatGPT-like services from SCMP.

Comment by sanxiyn on FLI open letter: Pause giant AI experiments · 2023-03-29T06:14:15.634Z · LW · GW

"Don't do anything for 6 months" is a ridiculous exaggeration. The proposal is to stop training for 6 months. You can do research on smaller models without training the large one.

I agree it is debatable whether the "beyond a reasonable doubt" standard is appropriate, but it seems entirely sane to pause for 6 months and use that time to, for example, discuss which standard is appropriate.

Other arguments you made seem to say "we shouldn't cooperate if the other side defects", and I agree, that's game theory 101, but that's not an argument against cooperating? If you are saying anything more, please elaborate.

Comment by sanxiyn on FLI open letter: Pause giant AI experiments · 2023-03-29T06:00:28.411Z · LW · GW

I mean, the letter is asking for six months, so it seems reasonable.

Comment by sanxiyn on What If: An Earthquake in Taiwan? · 2023-03-27T11:11:37.187Z · LW · GW

We don't need to guess, because the 2016 southern Taiwan earthquake happened on 2016-02-06. Here is the official press release from TSMC: TSMC Details Earthquake Impact, Updates 1Q'16 Guidance. This should provide the baseline for the case of a typical big earthquake in Taiwan.

I would summarize this as: TSMC has three big fab sites in Taiwan: Hsinchu, Taichung, and Tainan. (You can basically think of these as northern/central/southern Taiwan.) Earthquakes affecting one site won't affect the other sites, and the three sites are of similar size. So the expected effect is 1/3 of capacity delayed by up to 50 days.

In a worse case, all wafers being processed could have been trashed. This should produce a delay equal to the cycle time, which is very secret, but the usual estimate is 100 days (1 day per layer, and 100 layers in the latest process). In an even worse case, the fabs themselves could have been trashed. This should produce a delay equal to the construction time, which historically has been roughly three years.
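
Putting rough numbers on these scenarios (my own arithmetic on the estimates above, expressed as fraction-of-capacity times days of delay):

```python
layers, days_per_layer = 100, 1
cycle_time_days = layers * days_per_layer  # ~100 days of work-in-process per wafer

scenarios = {
    "typical earthquake (one of three sites, up to 50-day delay)": (1 / 3) * 50,
    "in-process wafers at one site scrapped (delay = cycle time)": (1 / 3) * cycle_time_days,
    "fabs trashed, rebuilt from scratch (~3 years)": 1.0 * 3 * 365,
}
for name, capacity_days in scenarios.items():
    print(f"{name}: ~{capacity_days:.0f} capacity-days of output lost")
```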

Comment by sanxiyn on GPT-4 · 2023-03-15T17:43:17.861Z · LW · GW

AP exams are scored on a scale of 1 to 5, so yes, getting the exact same score with zero difference makes sense.

Comment by sanxiyn on OpenAI introduce ChatGPT API at 1/10th the previous $/token · 2023-03-02T03:27:25.088Z · LW · GW

Any idea what those optimizations are? I am drawing a blank.

Comment by sanxiyn on GPT-4 Predictions · 2023-02-22T17:11:34.459Z · LW · GW

my rough guess is that GPT-4 will have twice the context length: 8192 tokens.

There is a Twitter rumor, supposedly based on a document leaked from OpenAI, which implies GPT-4 will have a context length of at least 32K(!).

Comment by sanxiyn on How should AI systems behave, and who should decide? [OpenAI blog] · 2023-02-17T02:35:14.332Z · LW · GW

Note that OpenAI already provides a fine-tuning API, and it's not difficult or expensive to use the API to influence the AI's values. See RightWingGPT for an example.

The RightWingGPT post also demonstrates that despite OpenAI's insistence that "our guidelines are explicit that reviewers should not favor any political group", ChatGPT has a clear political bias and the process is failing. (Or, more likely, the process is working as designed and OpenAI is lying here.)

Comment by sanxiyn on "Heretical Thoughts on AI" by Eli Dourado · 2023-01-20T07:14:42.747Z · LW · GW

You are wrong. Of course TFP is calculated based on real GDP, otherwise it would be meaningless.

There are real issues with how to measure real GDP, because the price index needs to be adjusted for quality. But that's different from calculating based on nominal GDP. As I understand it, the "same product getting cheaper" case you mentioned is nearly perfectly captured by current methods. What Jensen mentioned is different: it's "the same cost buying more", and that's more problematic.
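
For reference, the standard growth-accounting decomposition behind TFP (a sketch, with α the capital share; all quantities are real, not nominal):

```latex
Y_t = A_t K_t^{\alpha} L_t^{1-\alpha}
\quad\Longrightarrow\quad
\frac{\Delta A_t}{A_t}
  = \frac{\Delta Y_t}{Y_t}
  - \alpha \frac{\Delta K_t}{K_t}
  - (1-\alpha) \frac{\Delta L_t}{L_t}
```

Here Y_t is real GDP, so any mismeasurement of the price index shows up directly in measured TFP, which is exactly the quality-adjustment issue above.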

Comment by sanxiyn on Let’s think about slowing down AI · 2022-12-26T02:49:25.597Z · LW · GW

I completely agree and this seems good? I very much want to ally with unproductive rent-seekers and idiots to reduce existential risk. Thanks a lot, unproductive rent-seekers and idiots! (Though I most certainly shouldn't call them that if I want to ally with them.) I don't understand how this is in any way a strong case against the proposition.

Comment by sanxiyn on Let’s think about slowing down AI · 2022-12-24T02:11:13.304Z · LW · GW

Agreed. On the other hand, what I read suggests He Jiankui was bottlenecked on parental consent. For his first-in-human trial, he couldn't recruit any parents interested in editing PCSK9, but some parents, themselves HIV patients, whose contacts were relatively easily acquired from an HIV support group, really really cared about editing CCR5 (as you pointed out, and I agree, incorrectly), and were easily recruited. It sometimes happens that recruiting participants is the limiting factor in running trials, and I think that was the case here.

Comment by sanxiyn on Let’s think about slowing down AI · 2022-12-23T13:20:55.726Z · LW · GW

Very interesting! Recently, the US started to regulate the export of computing power to China. Do you expect this to speed up the AGI timeline in China, or do you expect the regulation to be ineffective, or something else?

Reportedly, NVIDIA developed the A800, which is essentially the A100 with interconnect bandwidth reduced just enough to comply, keeping the letter but probably not the spirit of the regulation. I am trying to follow closely how the A800 fares, because it seems to be an important data point on the feasibility of regulating computing power.

Comment by sanxiyn on [deleted post] 2022-12-23T12:16:15.279Z

That doesn't seem to match history. People gladly do expensive hash table name lookups (as in Python objects) even when simple addition (as in C structs) would be sufficient. Of course people will set up gigantic models even if the problem admits a simple linear algorithm.
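
A toy illustration of the point in Python (my own example): plain attribute access goes through a per-instance hash table, while __slots__ gives the C-struct-like fixed layout, yet almost everyone happily pays the dictionary price:

```python
class PointDict:
    def __init__(self, x, y):
        self.x, self.y = x, y  # stored in a per-instance hash table

class PointSlots:
    __slots__ = ("x", "y")     # fixed layout, no per-instance dict
    def __init__(self, x, y):
        self.x, self.y = x, y

p = PointDict(1, 2)
print(p.__dict__)                               # {'x': 1, 'y': 2} -- the hash table behind p.x
print(hasattr(PointSlots(1, 2), "__dict__"))    # False
```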

Comment by sanxiyn on [deleted post] 2022-12-23T04:07:37.706Z

Code generation will be almost universally automated

I must note that code generation is already almost universally automated: practically nobody writes assembly; it is almost always generated by compilers, but no, compilers didn't end programming.

Comment by sanxiyn on [deleted post] 2022-12-23T04:02:23.400Z

No doubt the earliest pioneers of computer science, emerging from the (relatively) primitive cave of electrical engineering, stridently believed that all future computer scientists would need to command a deep understanding of semiconductors, binary arithmetic, and microprocessor design to understand software.

This is disappointing, although not unexpected. Computer scientists, in general, are tragically bad at the history of their own field, although all of science is like that and it is not specific to computer science. Compare Alan Turing, who wrote in 1945, before there was any actual computer to program:

This process of constructing instruction tables should be very fascinating. There need be no real danger of it ever becoming a drudge, for any processes that are quite mechanical may be turned over to the machine itself.

That is, the earliest pioneers of computer science thought about automating programming tasks before there was any actual programming. As far as I know, zero actual pioneers of computer science believed programmers would need to understand hardware design to understand software. It would be especially unlike Alan Turing, since the big deal about the Turing machine was that a universal Turing machine can be built such that the underlying computing substrate can be ignored. Misconceptions like this seem specific to people like Matt Welsh, who came after the field was born and didn't bother to study the history of how it was born.

Comment by sanxiyn on Let’s think about slowing down AI · 2022-12-23T03:17:14.771Z · LW · GW

This seems to suggest "should we relax nuclear power regulation so that it is 1% less expensive to comply with?" as a promising way to fix the economics of nuclear power, and I don't buy that at all. Maybe it's different because Chernobyl happened, and a movie like The China Syndrome was made about a nuclear accident?

That sounds very hopeful, but it doesn't seem true to me. It implies slowing down AI will be easy: it just needs a Chernobyl-sized disaster and a good movie about it. The Chernobyl disaster was nearly harmless compared to COVID-19, and even COVID-19 was hardly an existential threat. If slowing down AI is this easy, we probably shouldn't waste time worrying about it before the AI Chernobyl happens.