Posts

Ought will host a factored cognition “Lab Meeting” 2022-09-09T23:46:08.412Z
Elicit: Language Models as Research Assistants 2022-04-09T14:56:37.763Z
Supervise Process, not Outcomes 2022-04-05T22:18:20.068Z
Beta test GPT-3 based research assistant 2020-12-16T13:42:50.432Z
Automating reasoning about the future at Ought 2020-11-09T21:51:14.353Z
Brainstorming positive visions of AI 2020-10-07T16:09:33.453Z

Comments

Comment by jungofthewon on Survey of NLP Researchers: NLP is contributing to AGI progress; major catastrophe plausible · 2022-09-12T00:03:46.907Z · LW · GW

Sure! Prior to this survey I would have thought:

  1. Fewer NLP researchers would have taken AGI seriously, identified understanding its risks as a significant priority, or considered it potentially catastrophic. 
    1. I found it particularly interesting that underrepresented researcher groups were more concerned (though that's less surprising in hindsight, especially considering the diversity of interpretations of catastrophe). I wonder how well the alignment community is doing with outreach to those groups. 
  2. There were more scaling maximalists (as the survey respondents themselves also predicted).

I was also encouraged that the majority of people thought the majority of research is crap.

...Though I'm not sure how that math works out exactly. Unless people are self-aware about the crap they publish :P

Comment by jungofthewon on (My understanding of) What Everyone in Technical Alignment is Doing and Why · 2022-09-01T18:47:26.723Z · LW · GW

All good, thanks for clarifying.

Comment by jungofthewon on Survey of NLP Researchers: NLP is contributing to AGI progress; major catastrophe plausible · 2022-09-01T16:05:40.453Z · LW · GW

This was really interesting, thanks for running and sharing! Overall this was a positive update for me. 

Results are here

I think this just links to PhilPapers, not your survey results? 

Comment by jungofthewon on (My understanding of) What Everyone in Technical Alignment is Doing and Why · 2022-08-31T14:07:07.454Z · LW · GW

and Ought either builds AGI or strongly influences the organization that builds AGI.

 

"strongly influences the organization that builds AGI" applies to all alignment research initiatives right? Alignment researchers at e.g. DeepMind have less of an uphill battle but they still have to convince the rest of DeepMind to adopt their work. 

Comment by jungofthewon on Common misconceptions about OpenAI · 2022-08-25T20:23:13.488Z · LW · GW

I also appreciated reading this.

Comment by jungofthewon on Deliberate Grieving · 2022-08-20T16:32:08.669Z · LW · GW

I found this post beautiful and somber in a sacred way.  Thank you.

Comment by jungofthewon on How to do theoretical research, a personal perspective · 2022-08-20T16:28:11.169Z · LW · GW

This was really helpful and fun to read. I'm sure it was nontrivial to get to this level of articulation and clarity. Thanks for taking the time to package it for everyone else to benefit from. 

Comment by jungofthewon on Rant on Problem Factorization for Alignment · 2022-08-07T20:33:48.514Z · LW · GW

If anyone has questions for Ought specifically, we're happy to answer them as part of our AMA on Tuesday.

Comment by jungofthewon on Rant on Problem Factorization for Alignment · 2022-08-05T19:54:49.026Z · LW · GW

I think we could play an endless and uninteresting game of "find a real-world example for / against factorization."

To me, the more interesting discussion is around building better systems for updating on alignment research progress:

  1. What would it look like for this research community to effectively update on results and progress? 
  2. What can we borrow from other academic disciplines? E.g. what would "preregistration" look like? 
  3. What are the ways more structure and standardization would be limiting / taking us further from truth? 
  4. What does the "institutional memory" system look like? 
  5. How do we coordinate the work of different alignment researchers and groups to maximize information value?

Comment by jungofthewon on Supervise Process, not Outcomes · 2022-04-12T02:43:00.166Z · LW · GW

Thanks for that pointer. It's always helpful to have analogies in other domains to take inspiration from.

Comment by jungofthewon on Learning By Writing · 2022-02-25T00:03:11.925Z · LW · GW

I enjoyed reading this, thanks for taking the time to organize your thoughts and convey them so clearly! I'm excited to think a bit about how we might build a process like this into Elicit.

This also seems like the research version of being hypothesis-driven / actionable / decision-relevant at work. 

Comment by jungofthewon on Reflections on six months of fatherhood · 2022-02-02T23:21:04.891Z · LW · GW

love. very beautifully written. today i will also try to scoot n+1 inches. 

Comment by jungofthewon on Vitalik: Cryptoeconomics and X-Risk Researchers Should Listen to Each Other More · 2021-11-29T02:15:35.485Z · LW · GW

It does - ty!

Comment by jungofthewon on Vitalik: Cryptoeconomics and X-Risk Researchers Should Listen to Each Other More · 2021-11-29T00:14:34.710Z · LW · GW

I think the Discord link is broken?

Comment by jungofthewon on How do we prepare for final crunch time? · 2021-04-01T14:55:31.097Z · LW · GW

 Access

Alignment-focused policymakers / policy researchers should also be in positions of influence. 

Knowledge

I'd add a bunch of human / social topics to your list e.g. 

  • Policy 
  • Every relevant historical precedent
  • Crisis management / global logistical coordination / negotiation
  • Psychology / media / marketing
  • Forecasting 

Research methodology / Scientific “rationality,” Productivity, Tools

I'd be really excited to have people use Elicit with this motivation. (More context here and here.)

Re: competitive games of introducing new tools, we did an internal speed Elicit vs. Google test to see which tool was more efficient for finding answers or mapping out a new domain in 5 minutes. We're broadly excited to structure and support competitive knowledge work and optimize research this way. 

Comment by jungofthewon on The case for aligning narrowly superhuman models · 2021-03-19T04:32:50.204Z · LW · GW

This is exactly what Ought is doing as we build Elicit into a research assistant using language models / GPT-3. We're studying researchers' workflows and identifying ways to productize or automate parts of them. In that process, we have to figure out how to turn GPT-3, a generalist by default, into a specialist that is a useful thought partner for domains like AI policy. We have to learn how to take feedback from the researcher and convert it into better results within session, per person, per research task, across the entire product. Another spin on it: we have to figure out how researchers can use GPT-3 to become expert-like in new domains. 

We’re currently using GPT-3 for classification, e.g. “take this spreadsheet and determine whether each entity in Column A is a non-profit, government entity, or company.” Some concrete examples of alignment-related work that have come up as we build this: 

  • One idea for making classification work is to have users generate explanations for their classifications. Then have GPT-3 generate explanations for the unlabeled objects. Then classify based on those explanations. This seems like a step towards “have models explain what they are doing.” (A rough sketch of this and the consistency checks below follows this list.)
  • I don’t think we’ll do this in the near future but we could explore other ways to make GPT-3 internally consistent, for example:
    • Ask GPT-3 why it classified Harvard as a “center for innovation.”
    • Then ask GPT-3 if that reason is true for Microsoft.
      • Or just ask GPT-3 if Harvard is similar to Microsoft.
    • Then ask GPT-3 directly if Microsoft is a “center for innovation.”
    • And fine-tune results until we get to internal consistency.
  • We eventually want to apply classification to the systematic review (SR) process, or some lightweight version of it. In the SR process, there is one step where two human reviewers identify which of 1,000-10,000 publications should be included in the SR by reviewing the title and abstract of each paper. After narrowing the set down to ~50, two human reviewers read each remaining paper in full and decide which should be included. Getting GPT-3 to replace these two human steps while matching two experts reading the whole paper seems like the kind of sandwiching task described in this proposal.
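
A minimal sketch of the explanation-based classification and consistency-check ideas above, assuming the pre-1.0 OpenAI Python client; the `complete` wrapper, label set, prompts, and example entities are illustrative placeholders rather than Elicit's actual implementation:

```python
import openai  # pre-1.0 OpenAI client; assumes OPENAI_API_KEY is set in the environment

LABELS = ["non-profit", "government entity", "company"]


def complete(prompt: str) -> str:
    """Thin wrapper around a GPT-3 completion call."""
    resp = openai.Completion.create(
        engine="davinci", prompt=prompt, max_tokens=64, temperature=0, stop="\n"
    )
    return resp.choices[0].text.strip()


def classify_via_explanation(entity, labeled_examples):
    """Classify an unlabeled entity by first generating an explanation, then a label.

    labeled_examples: (entity, label, explanation) triples, where the explanations
    were written by human users for their own classifications.
    """
    few_shot = "\n\n".join(
        f"Entity: {e}\nExplanation: {x}\nLabel: {l}" for e, l, x in labeled_examples
    )
    # Step 1: have the model generate an explanation for the unlabeled entity.
    explanation = complete(f"{few_shot}\n\nEntity: {entity}\nExplanation:")
    # Step 2: classify based on that explanation rather than the raw entity alone.
    return complete(
        f"{few_shot}\n\nEntity: {entity}\nExplanation: {explanation}\n"
        f"Label ({', '.join(LABELS)}):"
    )


def consistency_probe(entity_a, entity_b, label):
    """Probe internal consistency: does the model's stated reason for calling
    entity_a a `label` also hold for entity_b, and does it label entity_b the same way?"""
    reason = complete(f"Question: Why is {entity_a} a {label}?\nAnswer:")
    return {
        "reason_for_a": reason,
        "reason_holds_for_b": complete(
            f'Is the following also true of {entity_b}? "{reason}"\nAnswer yes or no:'
        ),
        "direct_label_for_b": complete(f"Is {entity_b} a {label}? Answer yes or no:"),
    }
```

In practice, the consistency signal would feed back into prompt iteration or fine-tuning until the answers line up, as the bullets above describe.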

We'd love to talk to people interested in exploring this approach to alignment!

Comment by jungofthewon on Open & Welcome Thread – March 2021 · 2021-03-10T02:23:42.660Z · LW · GW

Ought is building Elicit, an AI research assistant using language models to automate and scale parts of the research process. Today, researchers can brainstorm research questions, search for datasets, find relevant publications, and brainstorm scenarios.  They can create custom research tasks and search engines.  You can find demos of Elicit here and a podcast explaining our vision here.  

We're hiring for the following roles:

Each job description contains sample projects from our roadmap. 

Research is one of the primary engines by which society moves forward.  We're excited about the potential language models and ML have for making this engine orders of magnitude more effective. 

Comment by jungofthewon on 100 Tips for a Better Life · 2020-12-24T20:05:48.694Z · LW · GW

"Remember that you are dying."

Comment by jungofthewon on Beta test GPT-3 based research assistant · 2020-12-17T16:46:25.366Z · LW · GW

Do you have any examples?  

Comment by jungofthewon on Embedded Interactive Predictions on LessWrong · 2020-12-02T15:17:57.585Z · LW · GW

When Elicit has nice argument mapping (it doesn't yet, right?) it might be pretty cool and useful (to both LW and Ought) if that could be used on LW as well. For example, someone could make an argument in a post, and then have an Elicit map (involving several questions linked together) where LW users could reveal what they think of the premises, the conclusion, and the connection between them.

Yes, that is very aligned with the types of things we're interested in!! 

Comment by jungofthewon on Embedded Interactive Predictions on LessWrong · 2020-11-25T23:58:35.582Z · LW · GW

Lots of uncertainty but a few ways this can connect to the long-term vision laid out in the blog post:

  1. We want to be useful for making forecasts broadly. If people want to make predictions on LW, we want to support that. We specifically want some people to make lots of predictions so that other people can reuse the predictions we house to answer new questions. The LW integration generates lots of predictions and funnels them into Elicit.  It can also teach us how to make predicting easier in ways that might generalize beyond LW. 
  2. It's unclear how exactly the LW community will use this integration but if they use it to decompose arguments or operationalize complex concepts, we can start to associate reasoning or argumentative context with predictions. It would be very cool if, given some paragraph of a LW post, we could predict what forecast should be embedded next, or how a certain claim should be operationalized into a prediction. Continuing the takeoffs debate and Non-Obstruction: A Simple Concept Motivating Corrigibility start to point at this. 
  3. There are versions of this integration that could involve richer commenting in the LW editor.
  4. Mostly it was a quick experiment that both teams were pretty excited about :) 

Comment by jungofthewon on Embedded Interactive Predictions on LessWrong · 2020-11-22T14:02:40.172Z · LW · GW

I see what you're saying. This feature is designed to support tracking changes in predictions primarily over longer periods of time, e.g. for forecasts with years between creation and resolution. (You can even download a CSV of the forecast data to run analyses on it.)

It can get a bit noisy, like in this case, so we can think about how to address that. 

Comment by jungofthewon on Embedded Interactive Predictions on LessWrong · 2020-11-21T18:57:18.239Z · LW · GW

you mean because my predictions are noisy and you don't want to see them in that list? 

Comment by jungofthewon on Embedded Interactive Predictions on LessWrong · 2020-11-20T21:51:06.686Z · LW · GW

try it and let's see what happens! 

Comment by jungofthewon on Embedded Interactive Predictions on LessWrong · 2020-11-20T20:42:55.229Z · LW · GW

this is too much fun to click on

Comment by jungofthewon on Brainstorming positive visions of AI · 2020-10-10T03:48:39.255Z · LW · GW

Haha I didn't find it patronizing personally but it did take me an embarrassingly long time to figure out what Filipe did there :) Resource allocation seems to be a common theme in this thread. 

Comment by jungofthewon on Brainstorming positive visions of AI · 2020-10-10T03:46:59.205Z · LW · GW

Yes! For example, I am often amazed by people who are able to explain complex technical concepts in accessible and interesting ways. 

Comment by jungofthewon on Brainstorming positive visions of AI · 2020-10-08T03:29:56.733Z · LW · GW

Yes-anding you: our limited ability to run "experiments" and easily get empirical results for policy initiatives seems to really hinder progress. Maybe AI can help us organize our values, simulate a bunch of policy outcomes, and then find the best win-win solution when our values diverge. 

Comment by jungofthewon on Brainstorming positive visions of AI · 2020-10-08T03:24:55.832Z · LW · GW

I love the idea of exploring different minds and seeing how they fit. Getting chills thinking about what it means for humanity's capacity for pleasure to explode. And loving the image of swimming through a vast, clear, blue mind design ocean.  

Comment by jungofthewon on Brainstorming positive visions of AI · 2020-10-08T03:20:31.284Z · LW · GW

Doesn't directly answer the question but: AI tools / assistants are often portrayed as having their own identities. They have their own names e.g. Samantha, Clara, Siri, Alexa. But it doesn't seem obvious that they need to be represented as discrete entities. Can an AI system be so integrated with me that it just feels like me on a really really really good day? Suddenly I'm just so knowledgeable and good at math! 

Comment by jungofthewon on Brainstorming positive visions of AI · 2020-10-07T16:13:37.129Z · LW · GW

Instant translation across neuroatypical people, just like instant translation between English and Korean. An AI system that helps me understand what an autistic individual is currently experiencing and helps me communicate more easily with them. 

Comment by jungofthewon on Brainstorming positive visions of AI · 2020-10-07T16:12:48.832Z · LW · GW

An interactive, conversational system that makes currently expensive and highly manual therapy much more accessible. Something that talks you through a cortisol spike, anxiety attack, panic attack. 

Comment by jungofthewon on Brainstorming positive visions of AI · 2020-10-07T16:10:48.494Z · LW · GW

I tweeted an idea earlier: a tool that explains, in words you understand, what the other person really meant. Maybe it has settings like "gently nudge me if I'm unfairly assuming negative intent."

Comment by jungofthewon on Forecasting Thread: AI Timelines · 2020-08-28T14:02:52.469Z · LW · GW

I generally agree with this but think the alternative goal of "make forecasting easier" is just as good, might actually make aggregate forecasts more accurate in the long run, and may require things that seemingly undermine the virtue of precision.

More concretely, if an underdefined question makes it easier for people to share whatever beliefs they already have, and then facilitates rich conversation among those people, that's better than if a highly specific question prevents people from making a prediction at all. At least as much of the value of making public, visual predictions like this comes from the ensuing conversation and feedback as from the precision of the forecasts themselves, if not more. 

Additionally, a lot of assumptions get made at the time the question is defined more precisely, which could prematurely limit the space of conversation or ideas. There are good reasons why different people define AGI, or the moment of "AGI arrival," the way they do, and those reasons might not come up if the question askers took a particular point of view. 

Comment by jungofthewon on Alex Irpan: "My AI Timelines Have Sped Up" · 2020-08-21T22:45:54.678Z · LW · GW

You want to change "Your Distribution" to something like "Daniel's 2020 distribution"? 

Comment by jungofthewon on Alex Irpan: "My AI Timelines Have Sped Up" · 2020-08-21T18:43:09.424Z · LW · GW

Yeah, this was a lot more obvious to me when I plotted it visually: https://elicit.ought.org/builder/om4oCj7jm

(NB: I work on Elicit and it's still a WIP tool)