Announcing Squiggle: Early Access

post by ozziegooen · 2022-08-03T19:48:16.727Z · LW · GW · 7 comments

This is a link post for https://forum.effectivealtruism.org/posts/TfPdb2aMKzgWXFvc3/announcing-squiggle-early-access

Contents

  Introduction
  Work on Squiggle!
  Links
  What Squiggle is and is not
    What Squiggle Is
    What Squiggle Is Not
    Strengths
    Weaknesses
  Example: Piano Tuners
  Using Squiggle
      Playground
      Visual Studio Code Extension
      Typescript Library
      React Components Library
      Observable
  Public Squiggle Models
  Early Access
  Should effective altruists really fund and develop their own programming language?
  Questions & Answers
      What makes Squiggle better than Guesstimate?
      Should I trust Squiggle calculations for my nuclear power plant calculations?
      Can other communities use Squiggle?
      Is there an online Squiggle community?
  Future Work
  Contribute to Squiggle
  Organization and Funding
  Contributors
7 comments

Update: We've announced a $1,000 prize for early experimentation with Squiggle [EA · GW].

Introduction

Squiggle is a special-purpose programming language for probabilistic estimation[1]. Think: "Guesstimate as a programming language." Squiggle is free and open-source.

Our team has been using Squiggle for QURI estimations for the last few months and found it very helpful. The Future Fund recently began experimenting with Squiggle for estimating the value of different grants.

Now we're ready for others to begin experimenting with it publicly. The core API should be fairly stable; we plan to add functionality but intend to limit breaking changes.[2]

We'll do our best to summarize Squiggle for a diverse audience. If any of this seems intimidating, note that Squiggle can be used in ways not much more advanced than Guesstimate. If it looks too simple, feel free to skim or read the docs directly.

Work on Squiggle!

We're looking to hire people to work on Squiggle's main tooling. We're also interested in volunteers or collaborators for the ecosystem (get in touch!).

Links

Public Website, Github Repo, Previous LessWrong Sequence [? · GW]

What Squiggle is and is not

What Squiggle Is

What Squiggle Is Not

Strengths

Weaknesses

Example: Piano Tuners

Note: Feel free to skim this section; it's just meant to give a quick sense of what the language is.

Say you're estimating the number of piano tuners in New York City. You can build a simple model of this, like so.

// Piano tuners in NYC over the next 5 years
populationOfNewYork2022 = 8.1M to 8.4M // This means that you're 90% confident the value is between 8.1 and 8.4 Million.

proportionOfPopulationWithPianos = 0.002 to 0.01 // We assume there are almost no people with multiple pianos

pianoTunersPerPiano = {
    pianosPerPianoTuner = 2k to 50k // This is artificially narrow, to help graphics later
    1 / pianosPerPianoTuner
} // This {} syntax is a block. Only the last line of it, "1 / pianosPerPianoTuner", is returned.

totalTunersIn2022 = (populationOfNewYork2022 * proportionOfPopulationWithPianos * pianoTunersPerPiano)

totalTunersIn2022
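
For readers who want to see what this model computes without running Squiggle, here is a minimal Monte Carlo sketch in plain Python. It is not Squiggle's implementation. It assumes that `a to b` with two positive bounds denotes a lognormal distribution whose 5th and 95th percentiles are `a` and `b`, and the helper names (`lognormal_from_90ci`, `sample_to`, `simulate_tuners`) are hypothetical.

```python
import math
import random
import statistics

# Standard normal quantile at 0.95, so that [-Z95, +Z95] covers 90% of the mass.
Z95 = statistics.NormalDist().inv_cdf(0.95)


def lognormal_from_90ci(low, high):
    """(mu, sigma) of a lognormal with 5th percentile `low` and 95th percentile `high`."""
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z95)
    return mu, sigma


def sample_to(rng, low, high):
    """One draw from the assumed meaning of `low to high` (both bounds positive)."""
    mu, sigma = lognormal_from_90ci(low, high)
    return rng.lognormvariate(mu, sigma)


def simulate_tuners(n=10_000, seed=0):
    """Samples of totalTunersIn2022 under the assumptions stated above."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        population = sample_to(rng, 8.1e6, 8.4e6)
        proportion_with_pianos = sample_to(rng, 0.002, 0.01)
        pianos_per_tuner = sample_to(rng, 2e3, 50e3)
        # Dividing by pianosPerPianoTuner is the same as multiplying by pianoTunersPerPiano.
        samples.append(population * proportion_with_pianos / pianos_per_tuner)
    return samples


if __name__ == "__main__":
    tuners = simulate_tuners()
    print("median tuners:", statistics.median(tuners))
```

Each loop iteration draws one value per uncertain input and combines them, which is roughly what a sample-based estimator does behind the scenes.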

 

Now let's take this a bit further. Let's imagine that you think that NYC will rapidly grow over time, and you'd like to estimate the number of piano tuners for every point in time for the next few years.

// ...previous code
// t is the time in years after 2022
populationAtTime(t) = {
    averageYearlyPercentageChange = -0.01 to 0.05 // We're expecting NYC to continuously and rapidly grow. We model this as having a constant growth of between -1% and +5% per year.
    populationOfNewYork2022 * ((averageYearlyPercentageChange + 1) ^ t)
}
median(v) = quantile(v, .5)
totalTunersAtTime(t) = (populationAtTime(t) * proportionOfPopulationWithPianos * pianoTunersPerPiano)
{
    populationAtTime: populationAtTime,
    totalTunersAtTimeMedian: {|t| median(totalTunersAtTime(t))}
}
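
The time-dependent model can be sketched the same way. This is a plain-Python approximation, not Squiggle code, under two labeled assumptions: positive `a to b` ranges are lognormal 90% intervals, and a range that crosses zero (here `-0.01 to 0.05`) is a normal 90% interval. All function names are hypothetical.

```python
import math
import random
import statistics

Z95 = statistics.NormalDist().inv_cdf(0.95)


def sample_lognormal_90ci(rng, low, high):
    """Draw from a lognormal with 5th/95th percentiles low/high (assumed meaning of `to`)."""
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * Z95)
    return rng.lognormvariate(mu, sigma)


def sample_normal_90ci(rng, low, high):
    """Draw from a normal with 5th/95th percentiles low/high (assumed when a bound is <= 0)."""
    mean = (low + high) / 2
    sd = (high - low) / (2 * Z95)
    return rng.normalvariate(mean, sd)


def population_at_time(rng, t):
    """One sample of the NYC population t years after 2022."""
    yearly_change = sample_normal_90ci(rng, -0.01, 0.05)
    population_2022 = sample_lognormal_90ci(rng, 8.1e6, 8.4e6)
    return population_2022 * (1 + yearly_change) ** t


def total_tuners_at_time_median(t, n=5_000, seed=0):
    """Median of the simulated number of tuners t years after 2022."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        population = population_at_time(rng, t)
        proportion = sample_lognormal_90ci(rng, 0.002, 0.01)
        pianos_per_tuner = sample_lognormal_90ci(rng, 2e3, 50e3)
        samples.append(population * proportion / pianos_per_tuner)
    return statistics.median(samples)


if __name__ == "__main__":
    for year in (0, 10, 20, 40):
        print(year, total_tuners_at_time_median(year))
```

Because the growth rate is uncertain and compounds, the spread of `population_at_time` widens as `t` grows, which is why the model reports a summary such as the median at each point in time.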

Using settings in the playground, we can show this over a 40-year period. 

You can play with this directly at the playground here.
 

Some Playground details

  1. If you hover over the populationAtTime variable, you can see the distribution at any point.
  2. You can change “sample count” to change the simulation size. It starts at 1,000, which is good for experimentation, but you’ll likely want to increase this number for final results. The above graphs used sampleCount=1,000,000.
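
To see why a larger sample count matters for final results, here is a small self-contained Python sketch (not Squiggle code). It re-runs a toy Monte Carlo estimate several times at different sample counts and measures how much the estimated median jitters between runs. The toy distribution and the function names are illustrative assumptions.

```python
import random
import statistics


def estimate_median(sample_count, seed):
    """Monte Carlo estimate of the median of a toy lognormal quantity."""
    rng = random.Random(seed)
    samples = [rng.lognormvariate(0.0, 1.0) for _ in range(sample_count)]
    return statistics.median(samples)


def run_to_run_spread(sample_count, repeats=20):
    """Standard deviation of the estimate across independent re-runs."""
    estimates = [estimate_median(sample_count, seed) for seed in range(repeats)]
    return statistics.stdev(estimates)


if __name__ == "__main__":
    for count in (100, 1_000, 10_000):
        print(count, run_to_run_spread(count))
```

Monte Carlo error generally shrinks with roughly the square root of the sample count, so a small count is fine for quick iteration while a large one gives steadier final numbers.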

If you want to get ambitious, you can. Consider changes like:

Using Squiggle

You can currently interact with Squiggle in a few ways:

Playground

The Squiggle Playground is a friendly tool for working with small models and making prototypes. You can make simple, shareable links.

Visual Studio Code Extension

There's a simple Visual Studio Code extension for running and visualizing Squiggle code. We find that VS Code is a helpful editor for managing larger Squiggle setups.

(This example is a playful, rough experiment of an estimate of Ozzie's life output.)

Typescript Library

Squiggle is built using ReScript and is accessible via a simple TypeScript library. You can use this library either to run Squiggle code in full or to call specific functions within Squiggle (though this latter functionality is very minimal).

React Components Library

All of the components used in the playground and documentation are available in a separate component NPM repo. You can see the full Storybook of components here.

Observable

You can use Squiggle Components in Observable notebooks. Sam Nolan put together an exportable Observable Notebook of the main components. 

Public Squiggle Models

Early Access

We're calling this version "Early Access." It's more stable than some alphas, but we'd like it to be more robust before an official public beta. Squiggle is experimental, and we plan to explore effective altruist use cases for at least the next several months. You can think of it like some video games in "early access": they often have lots of exciting functionality but deprioritize the polish that published games often have. Great for a smaller group of engaged users, bad for larger organizations that expect robust stability or accuracy.

Should effective altruists really fund and develop their own programming language?

It might seem absurd to create an "entire programming language" that's targeted at one community. However, after some experimentation and investigation, we concluded that it seems worth trying. Some points:

Questions & Answers

What makes Squiggle better than Guesstimate?

Guesstimate is great for small models but can break down for larger or more demanding models. Squiggle scales much better.

Plain text formats also allow for many other advantages. For example, you can use Github workflows with versioning, pull requests, and custom team access. 

Should I trust Squiggle calculations for my nuclear power plant calculations?

No! Squiggle is still early and meant much more for approximation than precision. If the numbers are critical, we recommend checking them with other software. See this list of key known bugs and this list of gotchas.

Can other communities use Squiggle?

Absolutely! We imagine it could be a good fit for many business applications. However, our primary focus will be on philanthropic use cases for the foreseeable future. 

Is there an online Squiggle community?

There's some public discussion on Github. There's also a private EA/epistemics Slack with a few channels for Squiggle; message Ozzie to join. 

Future Work

Again, we're looking to hire a small team to advance Squiggle and its ecosystem. Some upcoming items:

Contribute to Squiggle

If you'd like to contribute to Squiggle or the greater ecosystem, there's a lot of neat stuff to be done. 

If funding is particularly useful or necessary to any such endeavor, let us know.

Organization and Funding

Squiggle is one of the main projects of The Quantified Uncertainty Research Institute. QURI is a 501(c)(3) primarily funded by the LTFF, SFF, and Future Fund.

Contributors

Squiggle has very much been a collaborative effort. You can see an updating list of contributors here.


Many thanks to Nuño Sempere, Quinn Dougherty, and Leopold Aschenbrenner for feedback on this post.

  1. ^

    Risk Analysis is a good term for Squiggle, but that term can be off-putting to those not used to it.

  2. ^

    Right now Squiggle is at version 0.3.0. When it hits major version numbers (1.0, 2.0), we might introduce breaking changes, but we’ll really try to limit them until then. It will likely be three to twenty months until we get to version 1.0.

7 comments

Comments sorted by top scores.

comment by mako yass (MakoYass) · 2022-10-02T01:31:19.393Z · LW(p) · GW(p)

Feedback: I think the first thing I'm going to want to do with this is hand-draw some of my probability distributions, and it looks like it doesn't come with a thing for that? (Should I not want this feature? Am I misunderstanding something about what doing practical statistics is like?)

Replies from: ozziegooen
comment by ozziegooen · 2022-10-02T02:33:02.021Z · LW(p) · GW(p)

Yea; that's not a feature that exists yet. 

Thanks for the feedback!

comment by Raemon · 2022-08-04T03:24:53.721Z · LW(p) · GW(p)

Is Squiggle built totally differently from Guesstimate? I could imagine a world where there's a Guesstimate-like intermediate layer that's designed to be more flexible.

Replies from: ozziegooen
comment by ozziegooen · 2022-08-04T03:36:09.625Z · LW(p) · GW(p)

Mostly. The core math bits of Guesstimate were a fairly thin layer on Math.js. Squiggle has replaced much of the Math.js reliance with custom code (custom interpreter + parser, extra distribution functionality).

If things go well, I think it would make sense to later bring Squiggle in as the main language for Guesstimate models. This would be a breaking change, and quite a bit of work, but would make Guesstimate much more powerful. 

comment by Cedar (xida-ren) · 2022-08-05T19:29:24.376Z · LW(p) · GW(p)

For to(a,b), is there a way to specify other confidence intervals?

 

E.g. let's say I have the 25, 50, 75 percentiles, but not the 5 and 95 percentiles?

Replies from: ozziegooen
comment by ozziegooen · 2022-08-06T02:30:37.055Z · LW(p) · GW(p)

Not yet. There are a few different ways of specifying the distribution, but we don't yet have an option for specifying it from the 25th and 75th percentiles. It would be nice to add eventually. (It might be very doable to add in a PR, for a fairly motivated person.)
https://www.squiggle-language.com/docs/Api/Dist#normal

You can type in normal({p5: 10, p95: 30}). It should later be possible to say normal({p25: 10, p75: 30}).

Separately, when you say "25, 50, 75 percentiles", do you mean all at once? This would be an overspecification; you only need two points. Also, would you want this to work for normal/lognormal distributions, or anything else?
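
For intuition on why two points suffice: a normal distribution has two parameters (mean and standard deviation), so any two distinct quantiles pin it down. The following is a minimal Python sketch of that conversion. `normal_from_quantiles` is a hypothetical helper written for illustration and is not part of Squiggle's API.

```python
from statistics import NormalDist


def normal_from_quantiles(p_low, x_low, p_high, x_high):
    """Return (mean, sd) of the normal whose p_low and p_high quantiles are x_low and x_high."""
    z_low = NormalDist().inv_cdf(p_low)    # standard normal quantile at p_low
    z_high = NormalDist().inv_cdf(p_high)  # standard normal quantile at p_high
    sd = (x_high - x_low) / (z_high - z_low)
    mean = x_low - z_low * sd
    return mean, sd


if __name__ == "__main__":
    print(normal_from_quantiles(0.05, 10, 0.95, 30))  # analogous to normal({p5: 10, p95: 30})
    print(normal_from_quantiles(0.25, 10, 0.75, 30))  # the not-yet-supported p25/p75 case
```

A third quantile, such as the median, would generally conflict with the other two unless it happened to match the fitted curve, which is the overspecification mentioned above.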

Replies from: xida-ren
comment by Cedar (xida-ren) · 2022-08-06T03:32:25.788Z · LW(p) · GW(p)

25,50,75:

I'm thinking, just like how `to` can infer whether it's normal or lognormal, we could use one of the bell-curve-shaped distributions that gives a sort of closest approximation.

More generally, it'd be awesome if there were a way to get the max entropy distribution given a bunch of statistics like quantiles or nsamples with min and max.