Ben Millwood's Shortform


post by Ben Millwood (ben-millwood) · 2024-07-15T14:42:42.777Z · LW · GW · 10 comments

Comments sorted by top scores.

comment by Ben Millwood (ben-millwood) · 2024-07-22T21:48:39.128Z · LW(p) · GW(p)

xAI has ambitions to compete with OpenAI and DeepMind, but I don't feel like it has the same presence in the AI safety discourse. I don't know anything about its attitude to safety, or how serious a competitor it is. Are there good reasons it doesn't get talked about? Should we be paying it more attention?

Replies from: tylerjohnston, Vladimir_Nesov, Vladimir_Nesov, Vladimir_Nesov

↑ comment by tylerjohnston · 2024-07-23T00:37:07.199Z · LW(p) · GW(p)

I've asked similar questions before and heard a few things. I also have a few personal thoughts that I thought I'd share here unprompted. This topic is pretty relevant for me so I'd be interested in what specific claims in both categories people agree/disagree with.

Things I've heard:

There's some skepticism about how well-positioned xAI actually is to compete with leading labs, because although they have a lot of capital and ability to fundraise, many of the main bottlenecks right now can't be solved simply by throwing more money at the problem. E.g. building infrastructure, securing power contracts, hiring top engineers, accessing huge amounts of data, and building on past work are all pretty limited by non-financial factors, and therefore the incumbents have lots of advantages. That being said, xAI is placed alongside Meta and Google in the highest-liquidity prediction market I could find on this question, which asks which labs will be "top 3" in 2025.
There's some optimism about their attitude to safety since Elon has been talking about catastrophic risks from AI in no uncertain terms for a long time. There's also some optimism coming from the fact that he/xAI opted to appoint Dan Hendrycks as an advisor.

Personal thoughts:

I'm not that convinced that they will take safety seriously by default. Elon's personal beliefs seem to be hard to pin down/constantly shifting, and honestly, he hasn't seemed to be doing that well to me recently. He's long had a belief that the SpaceX project is all about getting humanity off Earth before we kill ourselves, and I could see a similar attitude leading to the "build ASI asap to get us through the time of perils" approach that I know others at top AI labs have (if he doesn't feel this way already).
I also think (~65%) it was a strategic blunder for Dan Hendrycks to take a public position there. If there's anything I took away from the OpenAI meltdown, it's a greater belief in something like "AI Safety realpolitik": when the chips are down, all that matters is who actually has the raw power. Fancy titles mean nothing, personal relationships mean nothing, heck, being a literal director of the organization means nothing; all that matters is where the money, infrastructure, and talent are. So I don't think the advisor position will mean much, and I do think it will terribly complicate CAIS' efforts to appear neutral, lobby via their 501c4, etc. I have no special insight here, so I hope I'm missing something, or that the position does lead to a positive influence on their safety practices that wouldn't have been achieved by unofficial/ad-hoc advising.
I think most AI safety discourse is overly focused on the top 4 labs (OpenAI, Anthropic, Google, and Meta) and underfocused on international players, traditional big tech (Microsoft, Amazon, Apple, Samsung), and startups (especially those building high-risk systems like highly-technical domain specialists and agents). Similarly, I think xAI gets less attention than it should.

↑ comment by Vladimir_Nesov · 2024-07-26T07:05:40.200Z · LW(p) · GW(p)

A new Bloomberg article says xAI is building a datacenter in Memphis, planned to become operational by the end of 2025, mentioning a new-to-me detail that the datacenter targets 150 megawatts (more details on DCD). This implies a scale of 100,000 GPUs, or $4 billion in infrastructure, the bulk of the $6 billion it recently secured in its Series B.

This should be good for training runs that last a few months and cost on the order of $1 billion in compute time. And Dario Amodei is saying that this is the scale of today, for models that are not yet deployed. This puts xAI 18 months behind, a difficult place to rebound from unless long-horizon task capable AI that can do many jobs (a commercially crucial threshold that is not quite AGI) is many more years away.
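As a rough sanity check, the per-GPU figures implied by the numbers quoted above (150 megawatts, 100,000 GPUs, $4 billion, $6 billion Series B) can be worked out directly. This is only a sketch of the arithmetic; the variable names are illustrative, and "all-in" here means the whole datacenter divided by GPU count, not the power draw or price of a GPU on its own.

```python
# Figures quoted in the comment above.
datacenter_power_watts = 150e6    # 150 megawatts
gpu_count = 100_000               # 100,000 GPUs
infrastructure_cost_usd = 4e9     # $4 billion in infrastructure
series_b_usd = 6e9                # $6 billion Series B

# All-in figures: total datacenter power and cost divided by GPU count.
watts_per_gpu = datacenter_power_watts / gpu_count
cost_per_gpu_usd = infrastructure_cost_usd / gpu_count
share_of_series_b = infrastructure_cost_usd / series_b_usd

print(f"{watts_per_gpu:.0f} W per GPU (all-in)")
print(f"${cost_per_gpu_usd:,.0f} per GPU (all-in)")
print(f"{share_of_series_b:.0%} of the Series B")
```

That works out to 1.5 kW and $40,000 per GPU all-in, with the infrastructure taking about two thirds of the raise.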

↑ comment by Vladimir_Nesov · 2024-08-29T20:04:51.549Z · LW(p) · GW(p)

It seems the 100K H100s for the Memphis datacenter can plausibly get online around the end of 2024, and the planned release of Grok-3 gives additional indirect evidence [LW(p) · GW(p)] this might be the case. Meanwhile, OpenAI might have started training in May on a cluster that might also have 100K H100s. So I'm updating my previous guess [LW(p) · GW(p)] of xAI being 18 months behind to them only being 7-9 months behind for the 100K H100s scale (above 4e26 FLOPs).
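One way to see where a figure like 4e26 FLOPs for 100K H100s could come from is a back-of-the-envelope estimate. The comment does not give its inputs, so everything below except the GPU count is an assumption: a peak of about 989 teraFLOP/s per H100 (dense BF16, SXM variant), 40% utilization, and a run of about four months.

```python
gpu_count = 100_000            # from the comment: 100K H100s

# Assumptions, not stated in the comment:
peak_flops_per_gpu = 989e12    # approximate H100 SXM dense BF16 peak, FLOP/s
utilization = 0.40             # assumed fraction of peak actually achieved
training_days = 120            # assumed run length of about four months

seconds = training_days * 24 * 60 * 60
total_flops = gpu_count * peak_flops_per_gpu * utilization * seconds

print(f"{total_flops:.1e} FLOPs")
```

Under these assumptions the total lands at roughly 4e26 FLOPs; a different utilization or run length moves it proportionally.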

↑ comment by Vladimir_Nesov · 2024-07-23T04:00:54.761Z · LW(p) · GW(p)

For some reason current labs are not already running $10 billion training runs, and didn't build the necessary datacenters immediately. It would take a million H100s and 1.5 gigawatts, so supply issues seem likely. There is also a lot of engineering detail to iron out, so the scaling proceeds gradually.

But some of this might be risk aversion, an unwillingness to waste capital where a slower pace makes better use of it. As a new contender has no other choice, we'll get to see if it's possible to leapfrog scaling after all. And Musk has an affinity for impossible deadlines (not necessarily for meeting them), so the experiment will at least be attempted.

comment by Ben Millwood (ben-millwood) · 2025-01-22T10:38:51.421Z · LW(p) · GW(p)

Given ambiguity about whether GitHub trains models on private repos, I wonder if there's demand for someone to host a public GitLab (or similar) instance that forbids training models on their repos, and takes appropriate countermeasures [LW · GW] against training data web scrapers accessing their public content.

Replies from: nathan-helm-burger

↑ comment by Nathan Helm-Burger (nathan-helm-burger) · 2025-01-22T12:21:20.670Z · LW(p) · GW(p)

Yeah, for years I've been kinda shocked at how lax the security around private GitHub repos is. Seems like code is becoming a thing that can look innocent but be upstream of a general-purpose tool capable of producing recipes for novel weapons of mass destruction... Yeah. We really gotta step up security.

comment by Ben Millwood (ben-millwood) · 2024-07-15T14:42:42.877Z · LW(p) · GW(p)

I wonder if anyone has considered or built prediction markets that can pay out repeatedly: an example could be "people who fill in this feedback form will say that they would recommend the event to others", and each response that says yes causes shorts to pay longs (or noes pay yesses) and vice versa.

You'd need some mechanism to cap losses. I guess one way to model it is as a series of markets of the form "the Nth response will say yes", and a convenient interface to trade in the first N markets at a single price. That way, after a few payouts your exposure automatically closes. That said, it might make more sense to close out after a specified number of losses, rather than a specified number of resolutions (i.e. no reason to cap the upside) but it's less clear to me whether that structure has any hidden complexity.
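A minimal sketch of the two structures described above, from the point of view of the long side (the short side receives the negation of each payout). It assumes each per-response market pays 1 on a yes and 0 on a no, and that the whole strip is bought at one price; the function names are hypothetical.

```python
from typing import List


def strip_payouts(price: float, n_markets: int, responses: List[bool]) -> List[float]:
    """Long-side profit per response when holding the first `n_markets` markets.

    Exposure closes automatically after `n_markets` resolutions, so the
    long can lose at most price * n_markets.
    """
    covered = responses[:n_markets]
    return [(1.0 if said_yes else 0.0) - price for said_yes in covered]


def loss_capped_payouts(price: float, max_losses: int, responses: List[bool]) -> List[float]:
    """Long-side profit per response when the position closes after
    `max_losses` losing resolutions instead of a fixed number of resolutions.

    The downside is still bounded (price * max_losses), but the upside is not.
    """
    payouts: List[float] = []
    losses = 0
    for said_yes in responses:
        if losses >= max_losses:
            break  # position has closed; later responses no longer pay out
        payouts.append((1.0 if said_yes else 0.0) - price)
        if not said_yes:
            losses += 1
    return payouts


responses = [True, False, True, True, False, True]
print(strip_payouts(0.7, 3, responses))
print(loss_capped_payouts(0.7, 2, responses))
```

With these responses the fixed strip stops after three payouts, while the loss-capped position keeps paying until the second "no" arrives at the fifth response.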

The advantages over a single market that resolves to a percentage of yesses are probably pretty marginal? Most significant where there isn't going to be an obvious end time, but I don't have any examples of that immediately.

In general there's a big space of functions from consequences to payouts. Most of them probably don't make good "products", but maybe more do than are currently explored.

Replies from: Dagon

↑ comment by Dagon · 2024-07-15T15:59:58.171Z · LW(p) · GW(p)

In markets like these, "cap losses" is equivalent to "cap wins" - the actual money is zero-sum, right? There certainly exist wagers that scale ($10 per point difference, on a sporting event, for instance), and a LOT of financial investing has this structure (stocks have no theoretical maximum value).

I think your capping mechanism gets most of the value - maybe not "the Nth response is yes", but markets for a couple different sizes of vote counts, with thresholds for averages. "wins if over 10,000 responses with 65% yes, loses if over 10,000 less than 65% yes, money returned if fewer than 10,000 responses", with a number of wagers allowed with different size limits.
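The example wager above could be resolved by a rule like the following sketch. The treatment of the boundary cases is an assumption: the comment does not say what happens at exactly 10,000 responses or exactly 65% yes, so here both count toward a win. The function name and return values are hypothetical.

```python
def resolve_threshold_wager(
    yes_count: int,
    total_responses: int,
    min_responses: int = 10_000,
    yes_threshold: float = 0.65,
) -> str:
    """Return "win", "lose", or "refund" for the long side of the wager."""
    if total_responses < min_responses:
        return "refund"  # too few responses: money returned
    yes_share = yes_count / total_responses
    # Assumption: hitting the threshold exactly counts as a win.
    return "win" if yes_share >= yes_threshold else "lose"


print(resolve_threshold_wager(7_000, 10_000))   # enough responses, 70% yes
print(resolve_threshold_wager(6_000, 12_000))   # enough responses, 50% yes
print(resolve_threshold_wager(900, 1_000))      # too few responses
```

Offering several wagers with different `min_responses` values would correspond to the "different size limits" mentioned above.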

Replies from: ben-millwood

↑ comment by Ben Millwood (ben-millwood) · 2024-07-15T22:10:39.702Z · LW(p) · GW(p)

> In markets like these, "cap losses" is equivalent to "cap wins" - the actual money is zero-sum, right?

Overall, yes, per-participant no. For example, if everyone caps their loss at $1 I can still win $10 by betting against ten different people, though of course only at most 1 in 11 market participants will be able to do this.
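The "1 in 11" figure follows from the zero-sum constraint, as this small sketch shows. It assumes every loser loses exactly their cap and every winner wins exactly the target amount; the function name is hypothetical.

```python
def max_winner_fraction(loss_cap: float, target_win: float) -> float:
    """Largest share of participants who can each win `target_win`
    when nobody can lose more than `loss_cap` and money is zero-sum."""
    losers_per_winner = target_win / loss_cap  # each winner needs this many capped losers
    return 1 / (1 + losers_per_winner)


fraction = max_winner_fraction(loss_cap=1, target_win=10)
print(f"At most {fraction:.1%} of participants, i.e. 1 in {round(1 / fraction)}")
```

With a $1 cap and a $10 win, each winner needs ten losers, so winners make up at most one participant in eleven.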

> There certainly exist wagers that scale ($10 per point difference, on a sporting event, for instance), and a LOT of financial investing has this structure (stocks have no theoretical maximum value).

Yeah, although the prototypical prediction market has contracts with two possible valuations, existing prediction markets also support contracts that settle to a specific value. The thing that felt new to me about the idea I had was that you could have prediction contracts that pay out at times other than the end of their life, though it's unclear to me whether this is actually more expressive than packaged portfolios of binary, payout-then-disappear contracts.

(Portfolios of derivatives that can be traded atomically are nontrivially more useful than only being able to trade one "leg" at a time, and are another thing that exist in traditional finance but mostly don't exist in prediction markets. My impression there, though, is that these composite derivatives are often just a marketing ploy by banks to sell clients things that are tricky to price accurately, so they can hide a bigger markup on them; I'm not sure a more co-operative market would bother with them.)
