Posts
Investing in Robust Safety Mechanisms is critical for reducing Systemic Risks
2024-12-11T13:37:24.177Z
Call for evaluators: Participate in the European AI Office workshop on general-purpose AI models and systemic risks
2024-11-27T02:54:16.263Z
Workshop Report: Why current benchmark approaches are not sufficient for safety
2024-11-26T17:20:47.453Z
Comments
Comment by
Tom DAVID (tom-david) on
Investing in Robust Safety Mechanisms is critical for reducing Systemic Risks ·
2024-12-13T09:21:51.699Z ·
LW ·
GW
- Your first two bullet points are accurate; it would indeed be worthwhile to develop these points further.
- Finally, regarding your last bullet point, I agree. Currently, we do not know whether it is possible to develop such safeguards, and even if it were, doing so would require time and further research. I fully agree that this should be made more explicit!
Comment by
Tom DAVID (tom-david) on
A list of core AI safety problems and how I hope to solve them ·
2023-08-26T16:34:19.539Z ·
LW ·
GW
"Instead of building in a shutdown button, build in a shutdown timer."
-> Isn't that a form of corrigibility with an added constraint? I'm not sure what would prevent the system from convincing humans that respecting the timer is a bad thing, for example. Is it because we'll formally verify that we avoid instances of deception? It's not clear to me, but maybe I've misunderstood.