Comments
I agree it would be nice to have strong categories or formalism pinning down which future systems would be safe to open source, but it seems like an asymmetry in expected evidence to treat the lack of consensus about systems that don't exist yet as a pro-open-sourcing position. I think it's fair to say there is enough of a consensus that we don't know which future systems would be safe, and so we need more work to determine this before proliferation becomes irreversible.
What would be more persuasive is some evidence that AI is relatively more useful for making bioweapons than it is for doing things in general.
I see little reason to use that comparison rather than "will [category of AI models under consideration] improve offense (in bioterrorism, say) relative to defense?"
I'm glad I encountered this post! This comment is a bit of a brain dump, without much time taken to edit: a month or so ago I attempted a first pass at designing a semi-cooperative board game, and it ended up having some pretty similar elements. I got bogged down in a complicated-to-work-with mechanic that allows games to become arbitrarily large, which I won't go into. I'm not attached to owning any of the ideas I was trying to work in and would love to see somebody do something clever with them (especially if I get to play better games as a result).
The idea most interesting to me was reducing the incentive to be maximally self-serving by implementing abilities, conditions, or other game mechanics that use a Rawlsian veil-of-ignorance/random-dictatorship style partial or scaling 'reshuffling' of resources or positions. For example, if players are sufficiently disadvantaged by an equilibrium, they could (attempt to) acquire (possibly chaotic) opportunities to reshuffle resources or other circumstances, with the size of the reshuffle scaling with how disadvantaged they are (a rough sketch is below). It seems amenable to games where players have influence or control over a common set of characters, and where they have choices about how much information to give away based on which actions they choose (such as in A War of Whispers). Compatible with that is the ability, over the course of a game, to align one's win conditions with a particular character on the board or a broader coalition, with varying degrees of visibility of loyalties.
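As a minimal toy sketch (my own illustration, not a mechanic from the game I was designing), here's what a scaling reshuffle rule could look like: the further the triggering player is behind the leader, the larger the fraction of everyone's resources that gets pooled and redealt at random. The function name and numbers are made up for the example.

```python
import random

def reshuffle_scaling(scores, trigger_player, rng=random):
    """Pool a fraction of every player's resources and redeal it at random.

    Toy sketch: the pooled fraction scales with how far `trigger_player`
    is below the leader, so a player at parity reshuffles nothing and a
    player with nothing can force a near-total redeal.
    """
    leader = max(scores.values())
    if leader == 0:
        return dict(scores)
    # Fraction of the board that goes back "behind the veil".
    deficit = (leader - scores[trigger_player]) / leader
    pooled = 0
    new_scores = {}
    for player, s in scores.items():
        taken = int(round(s * deficit))
        pooled += taken
        new_scores[player] = s - taken
    # Redeal the pooled resources uniformly at random, one unit at a time.
    players = list(scores)
    for _ in range(pooled):
        new_scores[rng.choice(players)] += 1
    return new_scores

if __name__ == "__main__":
    random.seed(0)
    before = {"A": 12, "B": 7, "C": 2}
    after = reshuffle_scaling(before, trigger_player="C")
    print(before, "->", after)  # C's large deficit forces a large redeal
```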
The flatness of 'most points wins' suggests some degree of decoupling of "I win, you lose", i.e. a two-player game could result in any of WW, WL, LW, or LL. Part of the excitement of games is frequently having the outcome be very close or otherwise uncertain right up until the end, with the extremity of winning or losing typically being far more exciting than whether one gained a few additional points on a personal score on the last turn. This doesn't seem to necessarily trade off against higher-dimensional goals like trying to satisfy a set of interesting general or role-specific conditions. Quite a few games have a lot of their fun down suboptimal pathways relative to gaining the most money or points, and that can be reflected in optional goals to satisfy, e.g. sowing chaos, cute opportunities to roleplay one's position through gameplay, stealing the most points from other players or otherwise getting their goat, achieving a very difficult board configuration, satisfying rarely attainable goals, or otherwise doing things that give the game an interesting story about how everything went down.
There are lots of interesting gameplay features that could come from including negotiation and game-theoretic phenomena such as overlapping and changing coalitions; arbitration/mediation; binding precommitments or contracts with various means of enforcement; scaling pressures to cooperate, such as veil-of-ignorance/random-dictator reshuffling; hidden goals and opportunities or uncertain loyalties, where people give away some amount of information in pursuing them; and more granular opportunities for sanctions or more direct conflict.
The regulation is intended to encourage a stable equilibrium among labs that may willingly follow it for profit-motivated reasons.
Extreme threat modeling doesn't suggest ruling out plans that fail against almighty adversaries; it suggests using security mindset: reduce unnecessary load-bearing assumptions in the story you tell about why your system is secure. The proposal relies mostly on standard cryptographic assumptions, and doesn't seem likely to do worse in expectation than no regulation.
It is a subject matter expert.