November-December 2024 Progress in Guaranteed Safe AI

post by Quinn (quinn-dougherty) · 2025-01-22T01:20:00.868Z

Contents

  FrontierMath (hiring)
    See also
    Opportunity
  ARIA call: Mathematics for Safe AI (funding)
    See also
  Structure-aware version control via observational bridge types (hiring)
    See also
    Opportunity
  Squiggle AI

Sorry for the radio silence last month. It was slow and I didn’t come across things I wanted to write about, which was to be expected with the holidays coming up.

There are no benefits to paying, except that you get a cut of my hard-earned Shapley points, and apparently some disappointment when I miss a month.

If you're just joining us, background on GSAI here.

There should be some DeepSeek/R1 coverage in the next newsletter (I think tanking the cost of finetuning is, while dual-use, good news for us in GSAI). I’d rather get this one out the door though.

FrontierMath (hiring)

As I’m always banging on about, progress in math automation is a leading indicator of progress in software engineering automation, by the Curry-Howard correspondence. That’s a little cheeky, but the last time I said it in meatspace I got solemn nods as if it wasn’t even cheeky at all. Or maybe they were solemnly nodding at something else. To be explicit: mathematical reasoning abilities and software engineering skills (including formal proof engineering) appear to be at similar capability levels, and through platforms like Lean, mathematical ability can enhance software development capabilities. This connection should make it easier to implement rigorous quality assurance processes, such as obtaining formal proof certificates, which are typically costly.
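To make the “formal proof certificates” point concrete, here’s a toy Lean 4 sketch (mine, not from anything cited above): the same language hosts both the program and a machine-checked certificate about it.

-- Toy illustration only: a program and a proof certificate about it in one language.
def double (n : Nat) : Nat := n + n

-- A machine-checked certificate that double agrees with multiplication by two.
theorem double_eq_two_mul (n : Nat) : double n = 2 * n := by
  unfold double
  omega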

We introduce FrontierMath, a benchmark of hundreds of original, exceptionally challenging mathematics problems crafted and vetted by expert mathematicians. The questions cover most major branches of modern mathematics -- from computationally intensive problems in number theory and real analysis to abstract questions in algebraic geometry and category theory. Solving a typical problem requires multiple hours of effort from a researcher in the relevant branch of mathematics, and for the upper end questions, multiple days. FrontierMath uses new, unpublished problems and automated verification to reliably evaluate models while minimizing risk of data contamination. Current state-of-the-art AI models solve under 2% of problems, revealing a vast gap between AI capabilities and the prowess of the mathematical community. As AI systems advance toward expert-level mathematical abilities, FrontierMath offers a rigorous testbed that quantifies their progress.

The story here, of course, is that by the time the newsletter covered it, OpenAI hit 25%. “Mathematicians assemble a benchmark they can barely solve themselves at SOTA 2%, forecasters think it’ll be unsaturated for multiple years, a month later OpenAI hits 25%” is a fun little anecdote for Christmas with the relatives, for a funny sense of fun. But, an astute reader points out, since we have typical OpenAI science communication here (read: extremely disappointing science communication): what sort of inference compute was thrown at it? At what cost? A model that can win with the most naive prompt is more impressive than a model that only wins with galaxy-brained and/or expensive (in tokens) feedback schemes.

So anyways, the real story here, of course, is that after I drafted the previous paragraph, people noticed that Epoch sneakily added their funding disclosure to a new draft of the paper, a month after all the hoopla had started to die down. Spoiler alert: the funder was, lol, OpenAI. I’ll refer you to LessWrong user 7vik’s summary of events [LW · GW], and not get too far into it.

But I have feelings. What if the externality of mathematical automation and tooling, in the form of progress in guaranteed safe AI, doesn’t pan out? Then OpenAI gets its “put Terry Tao out of a job” Boy Scout badge for all the employee laptops and the safety community gets nothing in return! Come on guys, why is this even in Epoch’s mandate to begin with? Makes me sad.

See also

Review paper “The suite of datasets commonly used to train and evaluate the mathematical capabilities of AI-based mathematical copilots (primarily large language models) exhibit several shortcomings”, Elliot’s r/math AMA, and Kevin Buzzard on FrontierMath (written after the OpenAI results).

Opportunity

FrontierMath started with three tiers of difficulty (25% undergrad or IMO level, 50% grad student, 25% early professor-track researcher), but they’re currently expanding to a fourth tier of even harder problems. They also want some formalization in Lean; email elliot at epochai.org if you’d like to partake.

ARIA call: Mathematics for Safe AI (funding)

This round is affiliated with the Safeguarded AI programme, but is outside of the standard technical areas breakdown.

See also

Renaissance Philanthropy’s AI for Math fund (unclear how much this will help with GSAI efforts).

Structure-aware version control via observational bridge types (hiring)

David Jaz Myers writes on the Topos blog about structure-aware version control.

This is really exciting, because git diffs as we know them would cause problems if you were scaling up world models (in a davidad-style approach) or specifications (in any GSAI approach, including davidad’s) being collaborated on by many humans and AIs.

But suppose that you and I are editing a CSV, and you add a row to the bottom while I add a column to the end. Git would see your change as a single-line diff, whereas my change (adding a column) touches every line; these conflict at the new line you added. But from what we know about the structure of a CSV, your change should really be a one-row change, mine a one-column change, and the conflict should only occur in one cell.
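To make the contrast concrete, here’s a toy structured edit type for CSVs in Lean (my own sketch, not from the Topos post): because the intent of each edit is part of the data, a structure-aware merge only has to resolve the single cell where the new row meets the new column.

-- Toy sketch (mine, not from the Topos post): structured edits for a CSV.
abbrev Csv := List (List String)

inductive CsvEdit where
  | addRow (cells : List String)  -- your change: one new row at the bottom
  | addCol (cells : List String)  -- my change: one new cell appended to each row

-- Applying an edit. Merging an addRow with an addCol only requires deciding the
-- one cell where the new row meets the new column, rather than conflicting on
-- every line the way a textual diff does.
def CsvEdit.apply : CsvEdit → Csv → Csv
  | .addRow cells, rows => rows ++ [cells]
  | .addCol cells, rows => (rows.zip cells).map fun (row, cell) => row ++ [cell]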

The author proposes something called observational bridge types (from the up-and-coming proof assistant Narya) as the foundation of structure-aware version control. Using these, we can say that, for a filetype F and files f1 f2 : F, a diffing algorithm is none other than an inhabitant of the type Diff F f1 f2 (where Diff : (A : Type) -> A -> A -> Type is built using something called logical relations, or an inductive definition depending on the type). Then conflicts are defined as a pair (d1, d2) : Diff F f1 f2 × Diff F f1 f3, so that merges may be defined as another pair (d3, d4) : Diff F f2 f4 × Diff F f3 f4. That much is roughly consistent with git, provided that you assume F (the filetype) is always “list[list[char]]” or some other notion of raw text. It’d be great not to have to assume that, even just for the CSV example, to say nothing of more complicated structures in probabilistic semantics or complex/chaotic systems. A system that can also reason about diffs between the types themselves (since Diff Type sigma tau is perfectly reasonable here) is also desirable.
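Here’s the same shape as a minimal Lean sketch, under loud assumptions: Lean doesn’t have Narya’s observational bridge types, so Diff is left abstract, and the names Conflict and Merge are mine.

-- Sketch only: Diff stands in for the bridge-type / logical-relations construction.
universe u

axiom Diff : (A : Type u) → A → A → Type u

-- Two edits departing from a common ancestor: the raw material of a conflict.
def Conflict (F : Type u) (base left right : F) : Type u :=
  Diff F base left × Diff F base right

-- A merge closes the square: both branches reach a common descendant.
def Merge (F : Type u) (left right merged : F) : Type u :=
  Diff F left merged × Diff F right merged

The substance is all in how Diff gets constructed per filetype so that it sees rows and columns rather than lines of text; that’s the part the bridge types are for.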

See also

More on this from Owen at Topos at last year’s Safe By Design meeting at FAR Labs, from the GSAI Google group a while ago, and a roundup of tree diff tools.

Opportunity

To work with David Jaz on this for the world modeling part of Safeguarded AI in Oxford, apply here.

Squiggle AI

In the davidad (and, somewhat, Bengio) regions of the GSAI space, there’s a huge emphasis on world models: computational descriptions of what is. Squiggle is a programming language whose terms are distributions, with all the Monte Carlo boilerplate hidden from the user. This makes Squiggle a candidate “standard” for belief specifications, as I’ve written about elsewhere [LW · GW].
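For a sense of what “boilerplate hidden” means, here’s a rough Python sketch (mine, not QURI’s code; the model and numbers are invented) of the sampling machinery that a couple of lines of Squiggle would hide.

# Rough sketch of the Monte Carlo plumbing behind a Squiggle-style belief like
# "cost = (1k to 10k) * normal(1.2, 0.1)". All names and numbers are invented.
import numpy as np

rng = np.random.default_rng(0)
N = 100_000  # sample count; Squiggle manages this for you

# "1k to 10k" denotes a lognormal with that 90% credible interval.
low, high = np.log(1_000), np.log(10_000)
mu, sigma = (low + high) / 2, (high - low) / (2 * 1.645)
base_cost = rng.lognormal(mean=mu, sigma=sigma, size=N)

# Arithmetic on distributions becomes elementwise arithmetic on sample arrays.
multiplier = rng.normal(loc=1.2, scale=0.1, size=N)
cost = base_cost * multiplier

print(np.percentile(cost, [5, 50, 95]))  # the summary you'd get for free in Squiggle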

The first problem you’d run into is that the Users Will Not Just. The users will not just learn a programming language to specify beliefs when they barely want to specify their beliefs in the first place. That’s why it’s good news that QURI shipped an AI product, so the interface is closer to natural language while the substrate/backend is Squiggle. The LLM Will Just. The LLM will just learn to program in Squiggle, so you don’t have to.

It’s ready for you to use at squigglehub.org/ai (if authenticated), and you can read QURI’s press release on the EA Forum and elsewhere [EA · GW].
