Posts

Building AI Research Fleets 2025-01-12T18:23:09.682Z
Provably Safe AI: Worldview and Projects 2024-08-09T23:21:02.763Z
Announcing FAR Labs, an AI safety coworking space 2023-09-29T16:52:37.753Z
Protectionism will Slow the Deployment of AI 2023-01-07T20:57:11.644Z
Jackpot! An AI Vignette 2021-07-08T20:32:22.528Z
What price would you pay for the RadVac Vaccine and why? 2020-08-22T00:17:56.155Z
Solving Math Problems by Relay 2020-07-17T15:32:00.985Z
Post-mortem on the Center for Long-Term Cybersecurity forecasts 2020-01-09T19:38:33.289Z
2020's Prediction Thread 2019-12-30T23:18:46.991Z
[Part 1] Amplifying generalist research via forecasting – Models of impact and challenges 2019-12-19T15:50:33.412Z
[Part 2] Amplifying generalist research via forecasting – results from a preliminary exploration 2019-12-19T15:49:45.901Z
[Link] John Carmack working on AGI 2019-11-14T00:08:37.250Z
bgold's Shortform 2019-10-17T22:18:11.822Z
Running Effective Structured Forecasting Sessions 2019-09-06T21:30:25.829Z
How to write good AI forecasting questions + Question Database (Forecasting infrastructure, part 3) 2019-09-03T14:50:59.288Z
AI Forecasting Resolution Council (Forecasting infrastructure, part 2) 2019-08-29T17:35:26.962Z
AI Forecasting Dictionary (Forecasting infrastructure, part 1) 2019-08-08T16:10:51.516Z
Do bond yield curve inversions really indicate there is likely to be a recession? 2019-07-10T01:23:36.250Z
What is the best online community for questions about AI capabilities? 2019-05-31T15:38:11.678Z
What's the best approach to curating a newsfeed to maximize useful contrasting POV? 2019-04-26T17:29:30.806Z

Comments

Comment by Ben Goldhaber (bgold) on Provably Safe AI: Worldview and Projects · 2025-04-01T20:34:40.203Z · LW · GW

FYI @Zac Hatfield-Dodds: my probability has fallen below 10% - I expected at least one relevant physical<>cyber project to have started in the past six months, and since it hasn't, I doubt this will make the timeline. While I'm not conceding (because I'm still unsure how far AI uplift alone gets us), it seems right to note the update.

Comment by Ben Goldhaber (bgold) on bgold's Shortform · 2025-03-14T22:19:12.344Z · LW · GW

good to know thanks for flagging!

Comment by Ben Goldhaber (bgold) on bgold's Shortform · 2025-03-14T19:30:26.558Z · LW · GW

Recently learned about acquired savant syndrome: https://en.wikipedia.org/wiki/Jason_Padgett

After the attack, Padgett felt "off." He assumed it was an effect of the medication he was prescribed; but it was later found that, because of his traumatic brain injury, Padgett had signs of obsessive–compulsive disorder and post-traumatic stress disorder.[5] He also began viewing the world through a figurative lens of mathematical shapes.

"Padgett is one of only 40 people in the world with “acquired savant syndrome,” a condition in which prodigious talents in math, art or music emerge in previously normal individuals following a brain injury or disease.

This makes it seem more likely to me that bio interventions for increasing IQ in adult humans are possible, though likely Algernon's Law holds and there's a cost.

h/t @Jesse Hoogland 

Comment by Ben Goldhaber (bgold) on In response to critiques of Guaranteed Safe AI · 2025-02-08T02:55:04.579Z · LW · GW

Minor point: It seems unfair to accuse GSAI of being vaporware. It has been less than a year since the GSAI paper came out, and only 1.5 years since Tegmark/Omohundro's Provably Safe paper, and there are many projects being actively funded through ARIA and others that should serve as tests. No GSAI researchers that I know of promised significant projects in 2024 - in fact, several explicitly think the goal should be to do deconfusion and conceptual work now and plan to leverage the advances in autoformalization and AI-assisted coding that are coming down the pipe fast.

While I agree that there are not yet compelling demonstrations, this hardly seems at the level of Duke Nukem Forever!

Comment by Ben Goldhaber (bgold) on bgold's Shortform · 2025-01-31T19:31:21.743Z · LW · GW

What are the bottlenecks preventing 10x-100x scaling of Control Evaluations?

  • I'm not confident in the estimates of the safety margin we get from internal-only evaluations - eliciting strong subversion performance from models seems very hard, which makes it difficult to get satisfactory estimates of their capability to subvert control protocols.
  • I'd feel more confident if we had thousands of people trying to create red-team models, while thousands of blue teams proposed different monitoring methods and control protocols.
  • The types of experiments described in (1, 2, 3) seem particularly amenable to this kind of crowdsourced competition.
  • This seems like it fits naturally into DefCon-esque (my catchall for the hacker community) security competitions; I think this would be very popular, even without a large monetary prize.
  • Success would look like having strong confidence in lower and upper bound estimates of how well a given control protocol catches the best subversion attempts humanity can make (with red-teamed models and humans), and ideally an estimate of the cost of running such protocols (see the sketch below).
  • I think the reasons this could fail are:
    • lack of access to the actual frontier models (perhaps a lab could partner with a competition organizer, or rely on good open source models)
    • lack of the data collection necessary to have confidence in the results or outcomes; metrics might be hard to create meaningfully.
    • would the competition reflect the local conditions of frontier labs well enough that labs can generalize from the results? Ideally you have a consortium effort helping to ensure the setup reflects reality.
    • it's generally operationally difficult to coordinate lots of people.

Are there others?
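To make the "lower and upper bound estimates" bullet concrete, here's a minimal sketch (Python) of turning red-team trial outcomes into interval estimates of a control protocol's catch rate; the team names and trial counts are invented for illustration, not results from any real evaluation.

# Toy sketch: lower/upper bound estimates of a control protocol's catch rate
# from red-team trial outcomes. All numbers below are made up.
from statistics import NormalDist

def wilson_interval(caught: int, attempts: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for the probability the protocol catches an attempt."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = caught / attempts
    denom = 1 + z**2 / attempts
    center = (p + z**2 / (2 * attempts)) / denom
    margin = z * ((p * (1 - p) + z**2 / (4 * attempts)) / attempts) ** 0.5 / denom
    return max(0.0, center - margin), min(1.0, center + margin)

# Hypothetical competition results: (times caught, subversion attempts).
trials = {"red-team model A": (480, 500), "human red team": (44, 60)}

for team, (caught, attempts) in trials.items():
    low, high = wilson_interval(caught, attempts)
    print(f"{team}: catch rate {caught/attempts:.1%}, 95% CI [{low:.1%}, {high:.1%}]")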

Comment by Ben Goldhaber (bgold) on bgold's Shortform · 2025-01-30T19:39:33.983Z · LW · GW

I think more leaders of orgs should be trying to shape their organizations' incentives and cultures around the challenges of "crunch time". Examples of this include:

  • What does pay look like in a world where cognitive labor is automated in the next 5 to 15 years? Are there incentive structures (impact equity, actual equity, bespoke deals for specific scenarios) that can help team members survive, thrive, and stay on target?
  • What cultural norms should the team have around AI-assisted work? On the one hand it seems necessary to accelerate safety progress; on the other, I expect many applications are in fact trojan horses designed to automate people out of jobs (looking at you MSFT rewind) - are there credible deals to be made that can provide trust?
  • Does the organization expect to rapidly adapt to new events in AI - and if so, how will sensemaking happen - or does it expect to make its high-conviction bet early on and stay the course through distractions? Do team members know that?

I have more questions than answers, but the background level of stress and disorientation for employees and managers will be rising, especially in AI Safety orgs, and starting to come up w/ contextually true answers (I doubt there's a universal answer) will be important.

Comment by Ben Goldhaber (bgold) on Davidad's Bold Plan for Alignment: An In-Depth Explanation · 2025-01-16T19:45:48.039Z · LW · GW

This post was one of my first introductions to davidad's agenda and convinced me that while yes, it was crazy, it was maybe not impossible, and it led me to work on initiatives like the multi-author manifesto you mentioned.

Thank you for writing it!

Comment by Ben Goldhaber (bgold) on Building AI Research Fleets · 2025-01-16T19:33:25.203Z · LW · GW

I would be very excited to see experiments with agent-based models (ABMs) where the agents model fleets of research agents and tools. I expect in the near future we can build pipelines where the current fleet configuration - which should be defined in something like the Terraform configuration language - automatically generates an ABM that is used for evaluation, control, and coordination experiments.
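As a toy illustration of that pipeline - with an entirely hypothetical fleet-config schema, agent roles, and step logic, since no specific tool is named above - the config-to-ABM step could look something like this:

# Toy sketch: expand a declarative fleet configuration into an agent-based
# model (ABM) and run it. Schema, roles, and metrics are hypothetical.
import random
from dataclasses import dataclass

FLEET_CONFIG = {
    "agents": [
        {"role": "researcher", "count": 3, "error_rate": 0.10},
        {"role": "reviewer", "count": 1, "error_rate": 0.02},
    ],
    "steps": 10,
}

@dataclass
class Agent:
    role: str
    error_rate: float

    def act(self) -> bool:
        # An agent "succeeds" at its task unless it errs this step.
        return random.random() > self.error_rate

def build_abm(config: dict) -> list[Agent]:
    """Expand the declarative fleet config into concrete agents."""
    return [
        Agent(spec["role"], spec["error_rate"])
        for spec in config["agents"]
        for _ in range(spec["count"])
    ]

def run(config: dict) -> float:
    """Run the ABM and report the fraction of successful agent-steps."""
    agents = build_abm(config)
    outcomes = [agent.act() for _ in range(config["steps"]) for agent in agents]
    return sum(outcomes) / len(outcomes)

if __name__ == "__main__":
    print(f"success rate: {run(FLEET_CONFIG):.2f}")

The point of the sketch is just that the declarative config is the single source of truth, so the same file that provisions the fleet could also drive the evaluation runs.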

Comment by Ben Goldhaber (bgold) on bgold's Shortform · 2024-11-11T21:20:27.516Z · LW · GW

  • Cumulative Y2K readiness spending was approximately $100 billion, or about $365 per U.S. resident.
  • Y2K spending started as early as 1995, and appears to have peaked in 1998 and 1999 at about $30 billion per year.

https://www.commerce.gov/sites/default/files/migrated/reports/y2k_1.pdf

Comment by Ben Goldhaber (bgold) on Provably Safe AI: Worldview and Projects · 2024-09-04T17:40:11.009Z · LW · GW

Ah gotcha, yes let's do my $1k against your $10k.

Comment by Ben Goldhaber (bgold) on Provably Safe AI: Worldview and Projects · 2024-09-03T17:14:17.398Z · LW · GW

Given your rationale, I'm on board with "three or more consistent physical instances of the lock have been manufactured."

Let's 'lock' it in.

Comment by Ben Goldhaber (bgold) on Provably Safe AI: Worldview and Projects · 2024-08-21T23:50:31.213Z · LW · GW

@Raemon works for me; and I agree with the other conditions.

Comment by Ben Goldhaber (bgold) on Provably Safe AI: Worldview and Projects · 2024-08-19T04:12:12.436Z · LW · GW

This seems mostly good to me, thank you for the proposals (and sorry for my delayed response, this slipped my mind).

OR less than three consistent physical instances have been manufactured. (e.g. a total of three including prototypes or other designs doesn't count) 

Why this condition? It doesn't seem relevant to the core contention, and if someone prototyped a single lock using a GS AI approach but didn't figure out how to manufacture it at scale, I'd still consider it to have been an important experiment.

Besides that, I'd agree to the above conditions!

Comment by Ben Goldhaber (bgold) on Provably Safe AI: Worldview and Projects · 2024-08-12T15:16:31.514Z · LW · GW

  • (8) won't be attempted, or will fail at some combination of design, manufacture, or just-being-pickable. This is a great proposal and a beautifully compact crux for the overall approach.

I agree with you that this feels like a 'compact crux' for many parts of the agenda. I'd like to take your bet; let me reflect on whether there are any additional operationalizations or conditions.

However, I believe that the path there is to extend and complement current techniques, including empirical and experimental approaches alongside formal verification - whatever actually works in practice.

FWIW in Towards Guaranteed Safe AI we endorse this: "Moreover, while we have argued for the need for verifiable quantitative safety guarantees, it is important to note that GS AI may not be the only route to achieving such guarantees. An alternative approach might be to extract interpretable policies from black-box algorithms via automated mechanistic interpretability... it is ultimately an empirical question whether it is easier to create interpretable world models or interpretable policies in a given domain of operation."

Comment by Ben Goldhaber (bgold) on Ryan Kidd's Shortform · 2024-07-13T00:13:34.413Z · LW · GW

I agree with this; I'd like to see AI Safety scale with new projects. A few ideas I've been mulling:

- A 'festival week' bringing entrepreneur types and AI safety types together to cowork from the same place, along with a few talks and lots of mixers.
- Running an incubator/accelerator program at the tail end of a funding round, with fiscal sponsorship and some amount of operational support.
- More targeted recruitment for specific projects to advance important parts of a research agenda.

It's often unclear to me whether new projects should actually be new organizations; making it easier to spin up new projects, that can then either join existing orgs or grow into orgs themselves, seems like a promising direction.

Comment by Ben Goldhaber (bgold) on Davidad's Bold Plan for Alignment: An In-Depth Explanation · 2023-04-27T14:17:33.934Z · LW · GW

First off thank you for writing this, great explanation.

  • Do you anticipate acceleration risks from developing the formal models through an open, multilateral process? Presumably others could use the models to train and advance the capabilities of their own RL agents. Or is the expectation that regulation would accompany this such that only the consortium could use the world model?
  • Would the simulations be exclusively for 'hard science' domains - e.g. chemistry, biology - or would simulations of human behavior, economics, and politics also be needed? My expectation is that it would need the latter, but I imagine simulating hundreds of millions of intelligent agents would dramatically (prohibitively?) increase the complexity and computational costs.

Comment by Ben Goldhaber (bgold) on Protectionism will Slow the Deployment of AI · 2023-01-08T18:11:56.970Z · LW · GW

This seems like an important crux to me, because I don't think greatly slowing AI in the US would require new federal laws. I think many of the actions I listed could be taken by government agencies that over-interpret their existing mandates given the right political and social climate. For instance, the eviction moratorium during COVID obviously should have required congressional action, but was done by fiat through an over-interpretation of authority by an executive branch agency.

What they do or do not do seems mostly dictated by that socio-political climate, and by the courts, which means fewer veto points for industry.

Comment by Ben Goldhaber (bgold) on Protectionism will Slow the Deployment of AI · 2023-01-08T15:00:13.110Z · LW · GW

I agree that competition with China is a plausible reason regulation won't happen; that will certainly be one of the arguments advanced by industry and NatSec as to why it should not be throttled. However, I'm not sure it will be - and currently don't think it will be - stronger than the protectionist impulses. Possibly it will exacerbate the "centralization of AI" dynamic that I listed in the 'licensing' bullet point, where large existing players receive money and de-facto license to operate in certain areas and then avoid others (as memeticimagery points out). So for instance we see more military-style research, and GooAmBookSoft tacitly agree not to deploy AI that would replace lawyers.

To your point on big tech's political influence: they have, in some absolute sense, a lot of political power, but relatively they are much weaker in political influence than peer industries. I think they've benefitted a lot from the R-D stalemate in DC; I'm positing that protectionist regulation will go around/through this stalemate, and I don't think they currently have the soft power to stop that.

Comment by Ben Goldhaber (bgold) on What price would you pay for the RadVac Vaccine and why? · 2021-02-06T02:04:28.847Z · LW · GW

Hah yes - seeing that great post from johnswentworth inspired me to review my own thinking on RadVac. Ultimately I placed a lower estimate on RadVac being effective - or at least effective enough to get me to change my quarantine behavior - such that the price wasn't worth it, but I think I get a rationality demerit for not investing more in the collaborative model building (and collaborative purchasing) part of the process.

Comment by Ben Goldhaber (bgold) on What price would you pay for the RadVac Vaccine and why? · 2021-02-04T17:42:45.558Z · LW · GW

I'm sorry I didn't see this response until now - thank you for the detailed answer!

Comment by Ben Goldhaber (bgold) on Player vs. Character: A Two-Level Model of Ethics · 2020-10-01T18:32:18.593Z · LW · GW

I'm guessing your concern feels similar to ones you've articulated in the past around... "heart"/"grounded" rationality, or a concern about "disabling pieces of the epistemic immune system". 

I'm curious if, 8 months later, you feel you can better speak to what you see as the crucial misunderstanding?

Comment by Ben Goldhaber (bgold) on Thiel on Progress and Stagnation · 2020-08-18T22:43:28.785Z · LW · GW

Out of curiosity what's one of your more substantive disagreements with Thiel?

Comment by Ben Goldhaber (bgold) on RT-LAMP is the right way to scale diagnostic testing for the coronavirus · 2020-08-04T21:27:14.828Z · LW · GW

I'd be quite interested in reading that guide!

Comment by Ben Goldhaber (bgold) on Competition: Amplify Rohin’s Prediction on AGI researchers & Safety Concerns · 2020-08-01T02:37:02.705Z · LW · GW

Forecast - 25 mins

  • I thought it was more likely that in the short run there could be a preference cascade among top AGI researchers, and as others have mentioned, due to the operationalization of "top AGI researchers" this might be true already.
  • If this doesn't become a majority concern by 2050, I expect it will be because of another AI Winter, and I tried to have my distribution reflect that (a little hamfistedly).

Comment by Ben Goldhaber (bgold) on Rereading Atlas Shrugged · 2020-07-29T03:11:22.616Z · LW · GW

Thanks for posting this. I recently reread the Fountainhead, which I similarly enjoyed and got more out of than did my teenage self - it was like a narrative, emotional portrayal of the ideals in Marc Andreessen's It's Time to Build essay.

I interpreted your section on The Conflict as the choice between voice and exit.

Comment by Ben Goldhaber (bgold) on Solving Math Problems by Relay · 2020-07-18T00:07:49.706Z · LW · GW

The larger scientific question was related to Factored Cognition, and getting a sense of the difficulty of solving problems through this type of "collaborative crowdsourcing". The hope was running this experiment would lead to insights that could then inform the direction of future experiments, in the way that you might fingertip feel your way around an unknown space to get a handle on where to go next. For example if it turned out to be easy for groups to execute this type of problem solving, we might push ahead with competitions between teams to develop the best strategies for context-free problem solving.

In that regard it didn't turn out to be particularly informative, because it wasn't easy for the groups to solve the math problems, and it's unclear if that's because of the problems selected, the team compositions, the software, etc. So re: the larger scientific question I don't think there's much to conclude.

But personally I felt that by watching relay participants I gained a lot of UX intuitions around what type of software design and strategy design is necessary for factored strategies - what I broadly think of as problem solving strategies that rely upon decomposition - to work. Two that immediately come to mind:

  • Create software design patterns that allow the user to hide/reveal information in intuitive ways. It was difficult, when thrown into a huge problem doc with little context, to know where to focus. I wanted a way for the previous user to only show me the info I needed. For example, the way Workflowy / Roam Research bullet points allow you to hide unneeded details, and how if you click on a bullet point you're brought into an entirely new context.
  • When designing strategies, try focusing on the return signature: When coming up with new strategies for solving relay problems, at first it was entirely free form. I as a user would jump in, try pushing the problem as far as I could, and leave haphazard notes in the doc. Over time we developed more complex shorthand and shared strategies for solving a problem. One heuristic I now use when developing strategies for problem solving that use decomposition is to prioritize thinking about what each sub-part of the strategy will return to the top caller (see the sketch below). That clarifies the interface, simplifies what the person working on the sub-strategy needs to do, and promotes composability.
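As a minimal illustration of the "return signature first" heuristic - the problem, types, and function names here are hypothetical, not taken from the relay experiments:

# Hypothetical illustration: fix the return signature of each sub-strategy
# before worrying about how the sub-strategy itself is carried out.
from dataclasses import dataclass

@dataclass
class SubResult:
    """What every sub-strategy hands back to the top-level caller."""
    answer: str          # the sub-problem's conclusion
    confidence: float    # 0.0 - 1.0
    notes: str           # anything the next worker needs to know

def simplify_expression(expr: str) -> SubResult:
    # The body can be rough or delegated to another person/agent;
    # the contract with the caller is already fixed by SubResult.
    return SubResult(answer=expr.replace(" + 0", ""),
                     confidence=0.8,
                     notes="only removed additive identity terms")

def top_level_solver(expr: str) -> str:
    result = simplify_expression(expr)
    # The caller composes sub-results without caring how they were produced.
    return f"{result.answer} (confidence {result.confidence:.0%})"

if __name__ == "__main__":
    print(top_level_solver("x + 0 + y"))  # -> x + y (confidence 80%)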

These ideas are helpful because - I posit - we're faced with Relay Game like problems all the time. When I work on a project, leave it for a week, and come back, I think I'm engaging in a relay between past Ben, present Ben, and future Ben. Some of these ideas informed my design of templates for collaborative group forecasting.

Comment by Ben Goldhaber (bgold) on Solving Math Problems by Relay · 2020-07-17T23:08:18.775Z · LW · GW

Thanks, rewrote and tried to clarify. In essence the researchers were testing transmission of "strategies" for using a tool, where an individual was limited in what they could transmit to the next user, akin to this relay experiment.

In fact they found that trying to convey causal theories could undermine the next person's performance; they speculate that it reduced experimentation prematurely.

Comment by Ben Goldhaber (bgold) on bgold's Shortform · 2020-07-17T17:52:59.256Z · LW · GW

... my god...

Comment by Ben Goldhaber (bgold) on ESRogs's Shortform · 2020-07-04T15:36:27.767Z · LW · GW

Thanks for posting this. Why did you invest in those three startups in particular? Was it the market, the founders, personal connections? And was it a systematic search for startups to invest in, or more of an "opportunity-arose" situation?

Comment by Ben Goldhaber (bgold) on What are the best tools for recording predictions? · 2020-05-24T23:22:13.431Z · LW · GW

I know Ozzie has been thinking about this, because we were chatting about how to use an Alfred workflow to post to it. Which I think would be great!

Comment by Ben Goldhaber (bgold) on What are the best tools for recording predictions? · 2020-05-24T23:00:46.050Z · LW · GW

I've spent a fair bit of time in the forecasting space playing w/ different tools, and I never found one that I could reliably use for personal prediction tracking.

Ultimately for me it comes down to:

1.) Friction: the predictions I'm most interested in tracking are "5-second-level" predictions - "do I think this person is right", "is the fact that I have a cough and am tired a sign that I'm getting sick" etc. - and I need to be able to jot that down quickly.

2.) "Routine": There are certain sites that are toothbrush sites, aka I use them every day. I'm much more likely to adopt a digital habit if I can use one of those sites to fulfill the function.

So my current workflow for private predictions is to use a textexpander snippet w/ Roam.

- [[Predictions]]
  - {percentage}%
  - [[operationalized]]:
  - [[{date}]]
  - {{[[TODO]]}} [[outcome]]:


It doesn't have graphs, but I can get a pretty good sense of how calibrated I am, and if I want I could quickly export the markdown and evaluate it.
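As a rough sketch of what that evaluation could look like - assuming a hypothetical export format in which each prediction line carries a percentage and a resolved outcome (this is not Roam's actual export format) - a quick calibration check might be:

# Toy calibration check over an exported markdown prediction log.
import re

EXPORT = """
- [[Predictions]] I'll finish the draft by Friday
  - 70% outcome: yes
- [[Predictions]] It will rain tomorrow
  - 30% outcome: no
- [[Predictions]] The package arrives this week
  - 80% outcome: yes
"""

pattern = re.compile(r"(\d+)%\s+outcome:\s+(yes|no)")

buckets: dict[str, list[int]] = {}
for prob, outcome in pattern.findall(EXPORT):
    p = int(prob)
    low = (p // 20) * 20
    buckets.setdefault(f"{low}-{low + 19}%", []).append(1 if outcome == "yes" else 0)

for bucket, results in sorted(buckets.items()):
    hit_rate = sum(results) / len(results)
    print(f"{bucket} bucket: resolved yes {hit_rate:.0%} across {len(results)} predictions")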

Of course I want to mention foretold.io as another good site - if you want distributions, that's definitely the way to go.

Comment by Ben Goldhaber (bgold) on How likely is it that US states or cities will prevent travel across their borders? · 2020-03-14T19:40:52.129Z · LW · GW

The commerce clause gives the federal government broad powers to regulate interstate commerce, and in particular the U.S. Secretary of Health and Human Services can exercise it to institute quarantine. https://cdc.gov/quarantine/aboutlawsregulationsquarantineisolation.html

Comment by Ben Goldhaber (bgold) on bgold's Shortform · 2020-03-01T20:53:50.899Z · LW · GW

Depression as a concept doesn't make sense to me. Why on earth would it be fitness enhancing to have a state of withdrawal, retreat, collapse where a lack of energy prevents you from trying new things? I've brainstormed a number of explanations:

    • depression as chemical imbalance: a hardware level failure has occurred, maybe randomly maybe because of an "overload" of sensation
    • depression as signaling: withdrawal and retreat from the world indicates a credible signal that I need help
    • depression as retreat: the environment has become dangerous and bad and I should withdraw from it until it changes.

I'm partial to the explanation offered by the Predictive Processing Model, that depression is an extreme form of low confidence. As SSC writes:

imagine the world’s most unsuccessful entrepreneur. Every company they make flounders and dies. Every stock they pick crashes the next day. Their vacations always get rained-out, their dates always end up with the other person leaving halfway through and sticking them with the bill.
What if your job is advising this guy? If they’re thinking of starting a new company, your advice is “Be really careful – you should know it’ll probably go badly”.
if sadness were a way of saying “Things are going pretty badly, maybe be less confident and don’t start any new projects”, that would be useful...
Depression isn’t normal sadness. But if normal sadness lowers neural confidence a little, maybe depression is the pathological result of biological processes that lower neural confidence.

But I still don't understand why the behaviors we often see with depression - isolation, lack of energy - are 'longterm adaptive'. If a particular policy isn't working, I'd expect to see more energy going into experimentation.

[TK. Unfinished because I accidentally clicked submit and haven't finished editing the full comment]

Comment by Ben Goldhaber (bgold) on How much delay do you generally have between having a good new idea and sharing that idea publicly online? · 2020-02-22T20:28:19.494Z · LW · GW

I rarely share ideas online (I'm working on that); when I do the ideas tend to be "small" observations or models, the type I can write out quickly and send. ~10mins - 1 day after I have it.

Comment by Ben Goldhaber (bgold) on What is Success in an Immoral Maze? · 2020-01-10T18:24:26.309Z · LW · GW

I've heard that Talking Heads song dozens of times and have never watched the video. I was missing out!

Comment by Ben Goldhaber (bgold) on bgold's Shortform · 2020-01-06T19:26:01.206Z · LW · GW

neat hadn't seen that thanks

Comment by Ben Goldhaber (bgold) on What were the biggest discoveries / innovations in AI and ML? · 2020-01-06T19:25:11.754Z · LW · GW

NeurIPS best paper awards will likely contain good leads.

Comment by Ben Goldhaber (bgold) on Circling as Cousin to Rationality · 2020-01-05T22:52:15.328Z · LW · GW

I expect understanding something more explicitly - such as yours and another person's boundaries - w/o some type of underlying concept of acceptance of that boundary can increase exploitability. I recently wrote a shortform post on the topic of legibility that describes some patterns I've noticed here.

I don't think on average Circling makes one more exploitable, but I expect it increases variance, making some people significantly more exploitable than they were before because previously invisible boundaries are now visible, and can thus be attacked (by others but more often by a different part of the same person).

And yeah it does seem similar to the valley of bad rationality; the valley of bad circling, where when you're in the valley you're focusing on a naive form of connection without discernment of the boundaries.

Comment by Ben Goldhaber (bgold) on bgold's Shortform · 2020-01-05T22:16:52.528Z · LW · GW

  • Yes And is an improv technique where you keep the energy in a scene alive by going w/ the other person's suggestion and adding more to it. "A: Wow is that your pet monkey? B: Yes and he's also my doctor!"
  • Yes And is generative (creates a lot of output), as opposed to Hmm No which is critical (distills output)
  • A lot of the Sequences is Hmm No
  • It's not that Hmm No is wrong, it's that it cuts off future paths down the Yes And thought-stream.
  • If there's a critical error at the beginning of a thought that will undermine everything else then it makes sense to Hmm No (we don't want to spend a bunch of energy on something that will be fundamentally unsound). But if the later parts of the thought stream are not closely dependent on the beginning, or if it's only part of the stream that gets cut off, then you've lost a lot of potential value that could've been generated by the Yes And.
  • In conversation Yes And is much more fun, which might be why the Sequences are important as a corrective (yeah look, it's not fun to remember about biases, but they exist and you should model/include them)
  • Write drunk, edit sober. Yes And drunk, Hmm No in the morning.

Comment by Ben Goldhaber (bgold) on [Part 1] Amplifying generalist research via forecasting – Models of impact and challenges · 2020-01-05T18:22:22.198Z · LW · GW

IMO the term "amplification" fits if the scheme results in 1.) a clear efficiency gain and 2.) scalability. This looks like (delivering equivalent results at a lower cost OR providing better results for an equivalent cost, where cost == $$ & time), AND (~O(n) scaling costs).

For example, if there was a group of people who could emulate [Researcher's] fact checking of 100 claims but do it at 10x speed, then that's an efficiency gain as we're doing the same work in less time. If we pump the number to 1000 claims and the fact checkers could still do it at 10x speed without additional overhead complexity, then it's also scalable. Contrast that with the standard method of hiring additional junior researchers to do the fact checking - I expect it to not be as scalable ("huh, we've got all these employees, now I guess we need an HR department and perf reviews and...")

It does seem like a fuzzy distinction to me, and I am mildly concerned about overloading a term that already has an association w/ IDA.

Comment by Ben Goldhaber (bgold) on ozziegooen's Shortform · 2020-01-02T22:04:43.428Z · LW · GW

Is there not a distillation phase in forecasting? One model of the forecasting process is person A builds up their model, then distills a complicated question into a high-information, highly compressed datum, which can then be used by others. In my mind it's:

Model -> Distill -> "amplify" (not sure if that's actually the right word)

I prefer the term scalable instead of proliferation for "can this group do it cost-effectively" as it's a similar concept to that in CS.

Comment by Ben Goldhaber (bgold) on bgold's Shortform · 2020-01-02T21:58:39.549Z · LW · GW

Thanks for including that link - seems right, and reminded me of Scott's old post Epistemic Learned Helplessness

The only difference between their presentation and mine is that I’m saying that for 99% of people, 99% of the time, taking ideas seriously is the wrong strategy

I kinda think this is true, and it's not clear to me from the outset whether you should "go down the path" of getting access to level 3 magic given the negatives.

Probably good heuristics are proceeding with caution when encountering new/out there ideas, remembering you always have the right to say no, finding trustworthy guides, etc.

Comment by Ben Goldhaber (bgold) on bgold's Shortform · 2020-01-02T00:38:43.389Z · LW · GW

  • Why do I not always have conscious access to my inner parts? Why, when speaking with authority figures, might I have a sudden sense of blankness?
  • Recently I've been thinking about this reaction in the frame of 'legibility', ala Seeing Like a State. States would impose organizational structures on societies that were easy to see and control - they made the society more legible to the actors who ran the state - but these organizational structures were bad for the people in the society.
    • For example, census data, standardized weights and measures, and uniform languages make it easier to tax and control the population. [Wikipedia]
  • I'm toying with applying this concept across the stack.
    • If you have an existing model of people being made up of parts [Kaj's articles], I think there's a similar thing happening. I notice I'm angry but can't quite tell why or get a conceptual handle on it - if it were fully legible and accessible to the conscious mind, then it would be much easier to apply pressure and control that 'part', regardless of whether the control I am exerting is good. So instead, it remains illegible.
    • A level up, in a small group conversation, I notice I feel missed, like I'm not being heard in fullness, but someone else directly asks me about my model and I draw a blank, like I can't access this model or share it. If my model were legible, someone else would get more access to it and be able to control it/point out its flaws. That might be good or it might be bad, but if it's illegible it can't be "coerced"/"mistaken" by others.
    • One more level up, I initially went down this track of thinking for a few reasons, one of which was wondering why prediction forecasting systems are so hard to adopt within organizations. Operationalization of terms is difficult and it's hard to get a precise enough question that everyone can agree on, but it's very 'unfun' to have uncertain terms (people are much more likely to not predict than predict with huge uncertainty). I think the legibility concept comes into play - I am reluctant to put a term out that is part of my model of the world and attach real points/weight to it because now there's this "legible leverage point" on me.
      • I hold this pretty loosely, but there's something here that rings true and is similar to an observation Robin Hanson made around why people seem to trust human decision makers more than hard standards.
  • This concept of personal legibility seems associated with the concept of bucket errors, in that theoretically sharing a model and acting on the model are distinct actions, except I expect often legibility concerns are highly warranted (things might be out to get you).

Comment by Ben Goldhaber (bgold) on 2020's Prediction Thread · 2019-12-30T23:20:59.297Z · LW · GW

I'd also encourage you to link your predictions to Foretold/Metaculus/other prediction aggregator questions, though only if you write your prediction in the thread as well to prevent link rot.

Comment by Ben Goldhaber (bgold) on AlphaStar: Impressive for RL progress, not for AGI progress · 2019-11-13T18:12:28.958Z · LW · GW

I watched all of the Grandmaster-level games. When playing against grandmasters, the average win rate of AlphaStar across all three races was 55.25%, the unweighted mean of the per-race win rates below:

  • Protoss Win Rate: 78.57%
  • Terran Win Rate: 33.33%
  • Zerg Win Rate: 53.85%

Detailed match by match scoring

While I don't think that it is truly "superhuman", it is definitely competitive against top players.


Comment by Ben Goldhaber (bgold) on bgold's Shortform · 2019-10-23T19:22:34.889Z · LW · GW

https://twitter.com/esyudkowsky/status/910941417928777728

I remember seeing other claims/analyses of this but don't remember where.

Comment by Ben Goldhaber (bgold) on bgold's Shortform · 2019-10-21T20:17:07.533Z · LW · GW

Is the clearest "win" of a LW meme the rise of the term "virtue signaling"? On the one hand I'm impressed w/ how dominant it has become in the discourse, on the other... maybe our comparative advantage is creating really sharp symmetric weapons...

Comment by Ben Goldhaber (bgold) on bgold's Shortform · 2019-10-17T22:18:11.972Z · LW · GW

I have a cold, which reminded me that I want fashionable face masks to catch on so that I can wear them all the time in cold-and-flu season without accruing weirdness points.

Comment by Ben Goldhaber (bgold) on Daniel Kokotajlo's Shortform · 2019-10-14T19:18:57.220Z · LW · GW

I'm interested, and I'd suggest using https://foretold.io for this

Comment by Ben Goldhaber (bgold) on Hazard's Shortform Feed · 2019-10-14T17:51:56.607Z · LW · GW

I'd like to see someone in this community write an extension / refinement of it to further {need-good-color-name}pill people into the LW meme that the "higher mind" is not fundamentally better than the "animal mind".