aisafety.info, the Table of Contents
post by Charbel-Raphaël (charbel-raphael-segerie) · 2023-12-31
Here is a list of Q&As from https://aisafety.info/. When I discovered the site, I was impressed by the volume of material produced. However, the interface is optimized for beginners, so the following table of contents is for readers who want to navigate the various sections more freely. It was constructed by clustering the Q&As into subtopics. I'm not involved with aisafety.info; I just want to increase the visibility of the content they have produced by presenting it in a different way. (They are also working on a new interface.) This table can also be found at https://aisafety.info/toc/.
🆕 New to AI safety? Start here.
📘 Introduction to AI Safety
- What is AI safety?
- Why would an AI do bad things?
- How powerful will a mature superintelligence be?
- Why is safety important for smarter-than-human AI?
- How likely is extinction from superintelligent AI?
- What are some introductions to AI safety?
🧠 Introduction to ML
🤖 Types of AI
- What is artificial intelligence (AI)?
- What is "narrow AI"? & What is artificial general intelligence (AGI)?
- What is tool AI? & What is an agent?
- What is "transformative AI"?
- What are the differences between AGI, transformative AI, and superintelligence?
- What is "superintelligence"?
- What is a shoggoth?
- What is "whole brain emulation"?
- What are brain-computer interfaces?
🚀 Takeoff & Intelligence explosion
- Takeoff
- Intelligence explosion
- What are the differences between a singularity, an intelligence explosion, and a hard takeoff?
📅 Timelines
- Expert surveys
- Are compute and scaling enough?
- From AGI to ASI
❗ Types of Risks
- What are accident and misuse risks?
- What are existential risks (x-risks)?
- What are the main sources of AI existential risk?
- What are astronomical suffering risks (s-risks)?
- What about other risks from AI?
- How might things go wrong with AI even without an agentic superintelligence?
- How might an "intelligence explosion" be dangerous?
- Is large-scale automated AI persuasion and propaganda a serious concern?
- What is a “treacherous turn”?
- What is mindcrime?
🔍 What would an AGI be able to do?
- What is intelligence?
- Why would intelligence lead to power?
- Basic capabilities
- Advanced capabilities
- Strategic implications
🌋 Technical sources of misalignment
- Orthogonality thesis
- Specification gaming
- Why might a maximizing AI cause bad outcomes?
- What is instrumental convergence?
- What is corrigibility?
- What is perverse instantiation?
- Is it possible to code into an AI to avoid all the ways a given task could go wrong, and would it be dangerous to try that?
- Can we constrain a goal-directed AI using specified rules?
- Goal misgeneralization
- Outer and inner alignment
🎉 Current prosaic solutions
- What is imitation learning? & What is behavioral cloning?
- What is reinforcement learning from human feedback (RLHF) & "Constitutional AI"?
- How might interpretability be helpful?
- How is red teaming used in AI alignment?
🗺️ Strategy
- How likely is it that governments will play a significant role? What role would be desirable, if any?
- What would a "warning shot" look like?
- What is an alignment tax?
- Might an aligned superintelligence force people to have better lives and change more quickly than they want?
- Win conditions
- What are the "win conditions" for AI alignment?
- If we solve alignment, are we sure of a good future?
- What are "pivotal acts"?
- What is the "long reflection"?
- What would a good future with AGI look like?
- What would a good solution to AI alignment look like?
- At a high level, what is the challenge of alignment that we must meet to secure a good future?
- Race dynamics
- All things considered
- Impact of AI Safety
💭 Consciousness
- Could AI have emotions?
- Are AIs conscious?
- Do AIs suffer?
- Could we tell the AI to do what's morally right?
- Is there a danger in anthropomorphizing AIs and trying to understand them in human terms?
❓ Not convinced? Explore the arguments.
🤨 Superintelligence is unlikely?
- Why should we prepare for human-level AI technology now rather than decades down the line when it’s closer?
- Might an "intelligence explosion" never occur?
- Wouldn't a superintelligence be slowed down by the need to do experiments in the physical world?
- Can an AI really be smarter than humans?
- Will AI be able to think faster than humans?
- How can an AGI be smarter than all of humanity?
😌 Superintelligence won’t be a big change?
- Won’t AI be just like us?
- Isn’t AI just a tool like any other? Won’t it just do what we tell it to?
- Do people seriously worry about existential risk from AI?
- Are corporations superintelligent?
- Isn't capitalism the real unaligned superintelligence?
⚠️ Superintelligence won’t be risky?
- Are there any detailed example stories of what unaligned AGI would look like?
- Any AI will be a computer program. Why wouldn't it just do what it's programmed to do?
- Aren't robots the real problem? How can AI cause harm if it has no ability to directly manipulate the physical world?
- Wouldn't AIs need to have a power-seeking drive to pose a serious risk?
- Won't humans be able to beat an unaligned AI since we have a huge advantage in numbers?
- Wouldn't a superintelligence be wise?
🤔 Why not just?
- Why can't we just turn the AI off if it starts to misbehave?
- Once we notice that a superintelligence is trying to take over the world, can’t we turn it off, or reprogram it?
- Why don't we just not build AGI if it's so dangerous?
- Why can't we just make a "child AI" and raise it?
- Why can’t we just use Asimov’s Three Laws of Robotics?
- Why can’t we just “put the AI in a box” so that it can’t influence the outside world?
- Can't we limit damage from AI systems in the same ways we limit damage from companies?
- Why is AI alignment a hard problem?
🧐 Isn't the real concern…
- Isn't the real concern misuse?
- Isn't the real concern technological unemployment?
- Isn't the real concern bias?
- Isn't the real concern autonomous weapons?
📜 I have certain philosophical beliefs, so this is not an issue
- If I only care about helping people alive today, does AI safety still matter?
- Why should someone who is religious worry about AI existential risk?
- Does the importance of AI risk depend on caring about transhumanist utopias?
- Wouldn't it be a good thing for humanity to die out?
- Is AI safety about systems becoming malevolent or conscious and turning on us?
- Isn’t it immoral to control and impose our values on AI?
- We’re going to merge with the machines so this will never be a problem, right?
- Aren't AI existential risk concerns just an example of Pascal's mugging?
🔍 Want to understand the research? Dive deeper.
💻 Prosaic alignment
- What is prosaic alignment?
- Would AI alignment be hard with deep learning?
- Scalable oversight
- What is AI Safety via Debate?
- What is adversarial training?
- How is the Alignment Research Center (ARC) trying to solve Eliciting Latent Knowledge (ELK)?
- What is "HCH"?
- What is Iterated Distillation and Amplification (IDA)?
- What is Eliciting Latent Knowledge (ELK)?
- What does the scheme Externalized Reasoning Oversight involve?
- Interpretability
- What is interpretability and what approaches are there?
- What is the difference between verifiability, interpretability, transparency, and explainability?
- What are polysemantic neurons?
- What is a "polytope" in a neural network?
- What is feature visualization?
- What is neural network modularity?
- What does generative visualization look like in reinforcement learning?
- Where can I learn about interpretability?
- Conceptual advances
- Brain-like AGI
📝 Agent foundations
- What is "agent foundations"?
- Important concepts
- Decision theory
- Research directions
- What is "Do what I mean"?
- What are the power-seeking theorems?
- Can you give an AI a goal which involves “minimally impacting the world”?
- What is a "quantilizer"?
- Would it improve the safety of quantilizers to cut off the top few percent of the distribution?
- What is Infra-Bayesianism?
- What is "coherent extrapolated volition (CEV)"?
- What are the leading theories in moral philosophy and which of them might be technically the easiest to encode into an AI?
🏛️ Governance
- Would a slowdown in AI capabilities development decrease existential risk?
- Are there any AI alignment projects which governments could usefully put a very large amount of resources into?
- What is everyone working on in AI governance?
- What might an international treaty on the development of AGI look like?
- Is the UN concerned about existential risk from AI?
🔬 Research Organizations
- Overviews
- What approaches are AI alignment organizations working on?
- What is everyone working on in AI alignment?
- What are the main categories of technical alignment research?
- What are some AI alignment research agendas currently being pursued?
- What are the different AI Alignment / Safety organizations and academics researching?
- Briefly, what are the major AI safety organizations and academics working on?
- Prosaic
- Big labs
- Academic labs
- Other orgs
- Agent foundations
- Other
🤝 Want to help with AI safety? Get involved!
📌 General
- What actions can I take in under five minutes to contribute to the cause of AI safety?
- How and why should I form my own views about AI safety?
📢 Outreach
- How can I work on public AI safety outreach?
- What links are especially valuable to share on social media or other contexts?
- How can I work on AGI safety outreach in academia and among experts?
- How can I convince others and present the arguments well?
🧪 Research
- I want to work on AI alignment.
- 📚 Education and Career Path
- What master's thesis could I write about AI safety?
- What subjects should I study at university to prepare myself for alignment research?
- I want to take big steps to contribute to AI alignment (e.g. making it my career). What should I do?
- I would like to focus on AI alignment, but it might be best to prioritize improving my life situation first. What should I do?
- How can I work toward AI alignment as a software engineer?
- 📋 Guidance and Mentorship
- 🧪 Projects and Involvement
- I’d like to do experimental work (i.e. ML, coding) for AI alignment. What should I do?
- I want to help out AI alignment without necessarily making major life changes. What are some simple things I can do to contribute?
- How can I do conceptual, mathematical, or philosophical work on AI alignment?
- What are some exercises and projects I can try?
- How can I use a background in the social sciences to help with AI alignment?
- How can I do machine learning programming work to help with AI alignment?
- What should I do with my machine learning research idea for AI alignment?
- What should I do with my idea for helping with AI alignment?
🏛️ Governance
- What are some AI governance exercises and projects I can try?
- What are some helpful AI policy resources?
- How can I work on AI policy?
🛠️ Ops & Meta
- Where can I find people to talk to about AI alignment?
- How can I work on helping AI alignment researchers be more effective, e.g. as a coach?
- How can I work on assessing AI alignment projects and distributing grants?
- How can I do organizational or operations work around AI alignment?
💵 Help financially
- Would donating small amounts to AI safety organizations make any significant difference?
- I’m interested in providing significant financial support to AI alignment. How should I go about this?
📚 Other resources
- Where can I find videos about AI Safety?
- What training programs and courses are available for AGI safety?
- Where can I learn more about AI alignment?
- AI Safety Memes Wiki
- What are some good resources on AI alignment?
- What are some good podcasts about AI alignment?
- What are some good books about AGI safety?
- I’d like to get deeper into the AI alignment literature. Where should I look?
- How can I update my emotional state regarding the urgency of AI safety?
1 comment
comment by Martin Vlach (martin-vlach) · 2024-02-26
After a quick test, I find their chat interface prototype quite satisfying.