Comments

Comment by lisas (hoehne) on Reframing the burden of proof: Companies should prove that models are safe (rather than expecting auditors to prove that models are dangerous) · 2023-05-09T01:19:11.211Z · LW · GW

That seems like an excellent angle on the issue. I agree that reference models, and stakeholders' different attitudes towards them, likely have a huge impact. As such, the criticisms the FDA faces might indeed be an issue (at least that's how I understand your comment).

However, I'd carefully offer a bit of pushback on the aviation industry as an example, keeping in mind the difficult tradeoffs and diverging interests regulators will face in designing an approval process for AI systems. I think the problems that regulators will face are more similar to those of the FDA, and policymakers (if you assume they are your audience) might be more comfortable with a model that can somewhat withstand these problems.

Below is my reasoning (with a bit of overstatement/political rhetoric, e.g., "risking people's lives").

As you highlighted, the FDA is facing substantial criticism for being too cautious, e.g., with the Covid vaccine taking longer to approve than in the UK. Not permitting a medicine that would have been comparatively safe and highly effective, i.e., a false negative, means withholding something that could have had a profound positive impact on someone's life. And beyond the public interest, industry has considerable financial interests in getting these through too. In a similar vein, I expect that regulators will face quite some pushback when "slowing" innovation down, i.e., not approving a model. On the other side, being too fast in pushing drugs through the pipeline is also commonly criticized (e.g., the recent Alzheimer's drug approval as a false positive example). Even more so, losing its reputation as a trustworthy regulator has a lot of knock-on effects (i.e., will people trust an FDA-approved vaccine in the future?). As such, being too cautious and being too aggressive both have potentially high costs to people's lives, and striking the right balance is incredibly difficult.

The aviation industry also faces a tradeoff, but I would argue that one side is inherently "weaker" than the other (for lack of a better description). In case something bad happens, there are huge reputational costs to the regulator if it had invested "too little" in safety. A false negative error, however, i.e., overestimating the level of caution required and demanding more safety than necessary, does not necessarily hurt the reputation of the regulator; there are more or less only economic costs. And most people seem to be okay with high safety standards in aviation. In other words, and simplified: "overinvesting" in safety comes at an economic cost, while "underinvesting" in safety comes at a reputational cost to the regulator and potentially costs people's lives.

My guess is that the reputational risks (and competing goals) that AI regulators will face, in particular with regard to false negatives, are similar to those of the FDA. They will be seen as either too cautious/interventionist/innovation-hampering or too aggressive, if not both. Aviation safety is (in my perception) rarely seen as too cautious (or at least it is not something that gets routinely criticised by the public).

Policy-makers - especially those currently "battling big tech" - are quite well aware of the tradeoffs they will face and the breadth of stakeholders involved. As such, using an example that can withstand the reputational costs of applying too much caution might be a bit more powerful in some cases. In a similar vein, the FDA model has been much more thoroughly tested with regard to regulatory capture (not getting one drug approved is incredibly costly for a single firm but not for the whole industry, while industry-wide costs from safety restrictions in aviation can be passed on to consumers).

Nonetheless, I completely understand the concern that "we might not make too many friends", particularly among those focused on typical "pro-innovation considerations" or industry interests, and I agree that it makes sense to use this example with some caution.