It was very helpful of you to put this together - thank you!
I'd also add that historically I believe about two-thirds of LTFF's money has come from OpenPhil, so LTFF doesn't represent a fully independent funder (though the decision-making around grants is pretty independent).
This is a great poll and YouGov is a highly reputable pollster, but there is a significant caveat to note about the pause finding.
The way the question is framed provides information about "1000 technology leaders" who have signed a letter in favor of the pause but does not mention any opposition to the pause. I think this would push respondents toward favoring the pause. An ideally constructed question would present the issue more neutrally, with both support and opposition statements.
Good to know you won't be out of a job just yet!
I asked GPT4 the same prompt and here is what it said:
Of course! Here's a new contra dance for you:
Contra Dance
Whirlwind Delight
By ChatGPT
(Duple improper)
A1
(8) Circle left 3/4
(8) Neighbor swing
A2
(8) Long lines forward and back
(8) Ladies chain
B1
(8) Star right 1x
(8) Gents aleman left 1.5
B2
(8) Half hey, ladies pass right shoulders to start
(8) Partner balance and swing, end facing down in a line of four
Enjoy dancing Whirlwind Delight!
How did it do?
Yep! I was wrong and this is false!
If we want to know what arguments resonate with New York Times articles we can actually use surveys, message testing, and focus groups to check and we don't need to guess! (Disclaimer: My company sells these services.)
Cool - I'll follow up when I'm back at work.
That makes a lot of sense. We can definitely test a lot of different framings. I think the issue with a lot of these kinds of questions is that they are low salience, so people tend not to have pre-existing opinions and instead generate an opinion on the spot. We have a lot of experience polling on low-salience issues, though, because we've done a lot of polling on animal farming policy, which has similar framing effects.
I'll shill here and say that Rethink Priorities is pretty good at running polls of the electorate if anyone wants to know what a representative sample of Americans think about a particular issue such as this one. No need to poll Uber drivers or Twitter when you can do the real thing!
Yeah, it came from a lawyer. The point being that if you confess to something bad, we may be legally required to report that, so be careful.
Feel free to skip questions if you feel they aren't applicable to you.
Does the chance evolution got really lucky cancel out with the chance that evolution got really unlucky? So maybe this doesn't change the mean but does increase the variance? As for how much to increase the variance, maybe like an additional +/-1 OOM tacked on to the existing evolution anchor?
I'm kinda thinking there's like a 10% chance you'd have to increase it by 10x and a 10% chance you'd have to decrease it by 10x. But maybe I'm not thinking about this right?
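Here's a minimal sketch of that intuition in Python, assuming the evolution anchor is modeled as a distribution over log10(FLOP); the specific numbers (41 and 3) are made-up placeholders for illustration, not values from the actual report:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical evolution-anchor estimate: log10(FLOP) ~ Normal(41, 3^2).
# (41 and 3 are placeholder numbers, not figures from the report.)
base_log10 = rng.normal(loc=41, scale=3, size=1_000_000)

# Symmetric "evolution got lucky / got unlucky" term: up to +/-1 OOM, mean zero.
luck = rng.uniform(-1, 1, size=base_log10.size)
adjusted_log10 = base_log10 + luck

print(f"mean log10(FLOP): base {base_log10.mean():.2f}, adjusted {adjusted_log10.mean():.2f}")
print(f"std  log10(FLOP): base {base_log10.std():.2f}, adjusted {adjusted_log10.std():.2f}")
```

The symmetric term leaves the mean (and median) of log10(FLOP) essentially unchanged while widening the spread, which matches the "same mean, more variance" intuition in log space, though the mean in linear FLOP space does shift upward a bit.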
There are a lot of different ways you can talk about "efficiency" here. The main thing I am thinking about, with regard to the key question "how much FLOP would we expect transformative AI to require?", is whether, when using a neural net anchor (not evolution), to add a 1-3 OOM penalty to FLOP needs, given that 2022-AI systems are less sample efficient than humans (requiring more data to produce the same capabilities), with this penalty decreasing over time given expected algorithmic progress. The next question would be how much more efficient potential AI (e.g., 2100-AI, not 2022-AI) could be given the fundamentals of silicon vs. neurons, so we might know how much algorithmic progress could affect this.
I think it is pretty clear right now that 2022-AI is less sample efficient than humans. I think other forms of efficiency (e.g., power efficiency, efficiency of SGD vs. evolution) are less relevant to this.
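As a toy illustration of what such a penalty does to the headline number (the 1e30 anchor and the halving-per-decade rate are assumptions made up for the example, not figures from the report):

```python
# Toy illustration (all numbers are placeholders, not values from the report):
# how a 1-3 OOM sample-efficiency penalty shifts a FLOP-to-train estimate,
# and how that penalty might shrink with algorithmic progress.
anchor_log10_flop = 30          # hypothetical neural-net-anchor estimate

for penalty_oom in (1, 2, 3):
    print(f"penalty of {penalty_oom} OOM -> ~1e{anchor_log10_flop + penalty_oom} FLOP")

# Assume (purely for illustration) the penalty halves every decade:
penalty = 2.0
for year in (2022, 2032, 2042):
    print(f"{year}: effective requirement ~1e{anchor_log10_flop + penalty:.1f} FLOP")
    penalty /= 2
```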
Yeah ok 80%. I also do concede this is a very trivial thing, not like some "gotcha look at what stupid LMs can't do no AGI until 2400".
This is admittedly pretty trivial but I am 90% sure that if you prompt GPT4 with "Q: What is today's date?" it will not answer correctly. I think something like this would literally be the least impressive thing that GPT4 won't be able to do.
Thanks!
Is it ironic that the link to "All the posts I will never write" goes to a 404 page?
Does it get better at Metaculus forecasting?
This sounds like something that could be done by an organization creating a job for it, which could help with mentorship/connections/motivation/job security relative to expecting people to apply to EAIF/LTFF.
My organization (Rethink Priorities) is currently hiring for research assistants and research fellows (among other roles) and some of their responsibilities will include distillation.
These conversations are great and I really admire the transparency. It's really nice to see discussions that normally happen in private happen instead in public, where everyone can reflect, give feedback, and improve their own thoughts. On the other hand, the conversations add up to a decent-sized novel - LW says 198,846 words! Is anyone considering investing heavily in summarizing the content so people can get involved without having to read all of it?
I don't recall the specific claim, just that EY's probability mass for the claim was in the 95-99% range. The person argued that because EY disagrees with some other thoughtful people on that question, he shouldn't have such confidence.
I think people conflate the very reasonable "I am not going to adopt your 95-99% range because other thoughtful people disagree and I have no particular reason to trust you massively more than I trust other people" with the different claim "the fact that other thoughtful people disagree means there's no way you could arrive at 95-99% confidence", which is false. I think thoughtful people disagreeing with you is decent evidence you are wrong, but it can still be outweighed.
So it looks like we survived? (Yay)
I will be on the lookout for false alarms.
I can see whether the site is down or not. Seems pretty clear.
Attention LessWrong - I am a chosen user of EA Forum and I have the codes needed to destroy LessWrong. I hereby make a no first use pledge and I will not enter my codes for any reason, even if asked to do so. I also hereby pledge to second strike - if the EA Forum is taken down, I will retaliate.
Seems like "the right prompt" is doing a lot of work here. How do we know if we have given it "the right prompt"?
Do you think GPT-4 could do my taxes?
1.) I think the core problem is that honestly no one (except 80K) has actually invested significant effort in growing the EA community since 2015 (especially compared to the pre-2015 effort, and especially as a percentage of total EA resources).
2.) Some of these examples are suspect. The GiveWell numbers definitely look to be increasing beyond 2015, especially when OpenPhil's understandably constant contribution is removed - and this increase seems to line up with GiveWell's increased investment in their outreach. The OpenPhil numbers also just look to be sensitive to a few dominant eight-figure grants, which understandably are not annual events. (Also, my understanding is that OpenPhil is starting off slowly intentionally but will aim to ramp up significantly in the near future.)
3.) As I capture in "Is EA Growing? EA Growth Metrics for 2018", many relevant EA growth statistics have peaked after 2015 or haven't peaked yet.
4.) There are still a lot of ways EA is growing other than what is captured in these graphs. For example, I bet something like total budget of EA orgs has been growing a lot even since 2015.
5.) Contrary to the "EA is inert" hypothesis, EA Survey data has shown that many people have been "convinced" of EA. Furthermore, our general population surveys show that the vast majority of people (>95% of US university students) have still not heard of EA.
FWIW I put together "Is EA Growing? EA Growth Metrics for 2018" and I'm looking forward to doing 2019+2020 soon.
Mr. Money Mustache has a lot of really good advice that I find a lot of value from. However, I think Mr. Money Mustache underestimates the ease and impact of opportunities to grow income relative to cutting spending - especially if you're in (or can be in) a high-earning field like tech. Doubling your income will put you on a much faster path than cutting your spending a further 5%.
PredictionBook is really great for lightweight, private predictions and does everything you're looking for. Metaculus is great for more fully-featured predicting and I believe also supports private questions, but may be a bit of overkill for your use case. A spreadsheet also seems more than sufficient, as others have mentioned.
Thanks. I'll definitely aim to produce them more quickly... this one got away from me.
My understanding is that we have also spent, and might in the future spend, a decent amount of time in a "level 2.5", where some but not all non-essential businesses are open (e.g., no groups larger than ten, restaurants closed to dine-in, hair salons open).
A binary search strategy could still be more efficient, depending on the ratio of positives to negatives.
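Here's a rough simulation of that point, assuming "binary search" means pooled testing where a positive pool is split in half and each half is retested; the prevalence values and pool size are arbitrary, and the pooled test is assumed to be perfectly sensitive:

```python
import random

random.seed(0)

def binary_search_tests(statuses):
    """Test a pool once; split in half and recurse only if the pool is positive.
    Returns the number of tests used (assumes a perfectly sensitive pooled test)."""
    tests = 1                                # one test for the whole pool
    if any(statuses) and len(statuses) > 1:  # positive pool: split and recurse
        mid = len(statuses) // 2
        tests += binary_search_tests(statuses[:mid])
        tests += binary_search_tests(statuses[mid:])
    return tests

n = 1024
for prevalence in (0.01, 0.05, 0.2, 0.5):
    population = [random.random() < prevalence for _ in range(n)]
    pooled = binary_search_tests(population)
    print(f"prevalence {prevalence:4.0%}: pooled {pooled:5d} tests vs {n} individual")
```

At low prevalence the pooled approach uses far fewer tests; once a large fraction of samples are positive it can end up worse than just testing everyone individually.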
What about binary search?
This is a good answer.
Not really an answer, but a statement and a question - I imagine this is literally the least neglected issue in the world right now. How much does that affect the calculus? How much should we defer to people with more domain expertise?
This paper also seems helpful: https://arxiv.org/pdf/1812.11118.pdf
Answered here: https://forum.effectivealtruism.org/posts/YAwLfgwhg7opp3rTp/please-take-the-2019-ea-survey#G8Hn64AEyh3uMY2SG
The EA Survey is closing today! Please take! https://www.surveymonkey.co.uk/r/EAS2019LW
Thanks!
It could also be on the list of pros, depending on how one uses LW.
Are you offering to take donations in exchange for pressing the button or not pressing the button?
What happens if you don't check off everything for the day?
This sounds fairly similar to being on a board of a non-profit.
Nice post. I'd be curious to hear what all the monthly themes were.
Does anyone know if there are links to purchase some of this online?
I don't think so. The second equation is negative infinity for karma = 0, which seems not right.
That makes sense. On a mostly unrelated note, is there any way to get notified when someone replies to my comment?