"Taking AI Risk Seriously" (thoughts by Critch)
Great, let me throw together a reply to your questions in reverse order. I've had a long day and lack the energy to do the rigorous, concise write-up that I'd want to do. But please comment with specific questions/criticisms that I can look into later.
What is the thought process behind their approach?
RAISE (copy-paste from slightly-promotional-looking wiki):
AI safety is a small field. It has only about 50 researchers. The field is mostly talent-constrained. Given the dangers of an uncontrolled intelligence explosion, increasing the number of AI safety researchers is crucial for the long-term survival of humanity.
Within the LW community there are plenty of talented people who bear a sense of urgency about AI. They are willing to switch careers to doing research, but they are unable to get there. This is understandable: the path to research-level understanding is lonely, arduous, long, and uncertain. It is like a pilgrimage. One has to study concepts from the papers in which they first appeared. This is not easy. Such papers are undistilled. Unless one is lucky, there is no one to provide guidance and answer questions. And should one come out on top, there is no guarantee that the quality of their work will be sufficient for a paycheck or a useful contribution.
The field of AI safety is in an innovator phase. Innovators are highly risk-tolerant and have a large amount of agency, which allows them to survive an environment with little guidance or supporting infrastructure. Let community organisers not fall for the typical mind fallacy, expecting risk-averse people to move into AI safety all by themselves. Unless one is particularly risk-tolerant or has a perfect safety net, one will not be able to fully take the plunge. Plenty of measures can be taken to make getting into AI safety more like an "It's a Small World" ride:
- Let there be a tested path with signposts along the way to make progress clear and measurable.
- Let there be social reinforcement so that we are not hindered but helped by our instinct for conformity.
- Let there be high-quality explanations of the material to speed up and ease the learning process, so that it is cheap.
AI Safety Camp (copy-paste from our proposal, which will be posted on LW soon):
Aim: Efficiently launch aspiring AI safety and strategy researchers into concrete productivity by creating an ‘on-ramp’ for future researchers.
- Get people started on and immersed into concrete research work intended to lead to papers for publication.
- Address the bottleneck in AI safety/strategy of few experts being available to train or organize aspiring researchers by efficiently using expert time.
- Create a clear path from ‘interested/concerned’ to ‘active researcher’.
- Test a new method for bootstrapping talent-constrained research fields.
Method: Run an online research group culminating in a two-week intensive in-person research camp.
(our plan is to test our approach in Gran Canaria on 12 April, for which we're taking in applications right now, and then, based on our refinements, to organise a July camp at the planned EA Hotel in the UK)
What material do these groups cover?
RAISE (off the top of my head)
The study group has finished writing video scripts on the first corrigibility unit for the online course. It has now split into two to work on the second unit:
- group A is learning about reinforcement learning using this book
- group B is writing video scripts on inverse reinforcement learning
Robert Miles is also starting to make the first video of the first corrigibility unit (we've allowed ourselves to get delayed too much in actually publishing and testing material, IMO). Past videos we've experimented with include a lecture by Johannes Treutin from FRI and lectures by Rupert McCallum on corrigibility.
AI Safety Camp (copy-paste from proposal)
Participants will work in groups on tightly-defined research projects on the following topics:
- Agent foundations
- Machine learning safety
- Policy & strategy
- Human values
Projects will be proposed by participants prior to the start of the program. Expert advisors from AI Safety/Strategy organisations will help refine them into proposals that are tractable, suitable for this research environment, and address currently unsolved research questions. This allows for time-efficient use of advisors’ domain knowledge and research experience, and ensures that research is well-aligned with current priorities.
Participants will then split into groups to work on these research questions in online collaborative groups over a period of several months. This period will culminate in a two-week in-person research camp aimed at turning this exploratory research into first drafts of publishable research papers. This will also allow for cross-disciplinary conversations and community building. Following the two-week camp, advisors will give feedback on manuscripts, guiding first drafts towards completion and advising on next steps for researchers.
Who's running them and what's their background?
Our two core teams mostly consist of young European researchers/autodidacts who haven't published much on AI safety yet (which does carry the risk that we don't know enough about the outcomes we're trying to design for others).
RAISE (off the top of my head):
Toon Alfrink (founder, coordinator): AI bachelor student, also organises LessWrong meetups in Amsterdam.
Robert Miles (video maker): Runs a relatively well-known YouTube channel advocating carefully for AI safety.
Veerle de Goederen (oversees the prereqs study group): Finished a Biology bachelor's (and has been our most reliable team member).
Johannes Heidecke (oversees the advanced study group): Master's student, researching inverse reinforcement learning in Spain.
Remmelt Ellen (planning coordinator): see below.
AI Safety Camp (copy-paste from proposal)
Remmelt Ellen: Remmelt is the Operations Manager of Effective Altruism Netherlands, where he coordinates national events, supports organisers of new meetups and takes care of mundane admin work. He also oversees planning for the team at RAISE, an online AI Safety course. He is a Bachelor intern at the Intelligent & Autonomous Systems research group. In his spare time, he’s exploring how to improve the interactions within multi-layered networks of agents to reach shared goals – especially approaches to collaboration within the EA community and the representation of persons and interest groups by negotiation agents in sub-exponential takeoff scenarios.
Tom McGrath: Tom is a maths PhD student in the Systems and Signals group at Imperial College, where he works on statistical models of animal behaviour and physical models of inference. He will be interning at the Future of Humanity Institute from Jan 2018, working with Owain Evans. His previous organisational experience includes co-running Imperial’s Maths Helpdesk and running a postgraduate deep learning study group.
Linda Linsefors: Linda has a PhD in theoretical physics, which she obtained at Université Grenoble Alpes for work on loop quantum gravity. Since then she has studied AI and AI Safety online for about a year. Linda is currently working at Integrated Science Lab in Umeå, Sweden, developing tools for analysing information flow in networks. She hopes to be able to work full time on AI Safety in the near future.
Nandi Schoots: Nandi did a research master's in pure mathematics and a minor in psychology at Leiden University. Her master's was focused on algebraic geometry and her thesis was in category theory. Since graduating she has been steering her career in the direction of AI safety. She is currently employed as a data scientist in the Netherlands. In parallel to her work she is part of a study group on AI safety and involved with the reinforcement learning section of RAISE.
David Kristoffersson: David has a background as R&D Project Manager at Ericsson, where he led a project of 30 experienced software engineers developing many-core software development tools. He liaised with five internal stakeholder organisations, worked out strategy, made high-level technical decisions and coordinated a disparate set of subprojects spread over seven cities on two different continents. He has a further background as a Software Engineer and has a BS in Computer Engineering. In the past year, he has contracted for the Future of Humanity Institute, and has explored research projects in ML and AI strategy with FHI researchers.
Chris Pasek: After graduating from mathematics and theoretical computer science, Chris ended up touring the world in search of meaning and self-improvement, and finally settled on working as a freelance researcher focused on AI alignment. He is currently also running a rationalist shared housing project on the tropical island of Gran Canaria and continuing to look for ways to gradually self-modify in the direction of a superhuman FDT-consequentialist.
Mistake: I now realise that not mentioning that I'm involved with both projects may resemble a conflict of interest – I had removed 'projects I'm involved with' from my earlier comment before posting it to keep it concise.