What prevents SB-1047 from triggering on deep fake porn/voice cloning fraud?

post by ChristianKl · 2024-09-26T09:17:39.088Z · LW · GW · 7 comments

This is a question post.


Recently, there was a post on SB-1047 [LW · GW] arguing that it's quite mild regulation. I'm not an expert on it and don't know how it works.

In the comment section I was asking:

Why wouldn't deep fake porn, or voice cloning technology used to commit fraud, be powerful enough to materially contribute to critical harm?

There are cases of fraud that could do $500,000,000 in damages.

Given how juries decide damages, a model that's used to create child porn depicting thousands of children could be argued to cause $500,000,000 in damages as well, especially when coupled with something like trying to extort the children.
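To put a rough number on that (my own illustrative figures, not anything taken from the bill): if a jury awarded even $100,000 per victim, then 5,000 victims × $100,000 = $500,000,000, which is exactly the bill's damage threshold.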

I'm surprised that my comment didn't get any engagement from people explaining how they think the law would handle those cases, while at the same time getting no karma votes.

I'd love to believe that the law is well thought out, and simply a good step for AI safety. At the same time, I also like having accurate beliefs about the effects of the law, so let me repeat my question here.

How does the law handle damage caused by deep fake porn or fraud with voice cloning?


Answers

answer by RobertM · 2024-09-27T03:20:38.434Z · LW(p) · GW(p)

Notwithstanding the tendentious assumption in the other comment thread that courts are maximally adversarial processes bent on misreading legislation to achieve their perverted ends, I would bet that the relevant courts would not in fact rule that a bunch of deepfaked child porn counted as "Other grave harms to public safety and security that are of comparable severity to the harms described in subparagraphs (A) to (C), inclusive", where those other things are "CBRN > mass casualties", "cyberattack on critical infra", and "autonomous action > mass casualties". Happy to take such a bet at 2:1 odds.

But there are some simpler reasons why that particular hypothetical fails:

  • Image models are just not nearly as expensive to train, so it's unlikely that they'd fall under the definition of a covered model to begin with.
  • Even if someone used a covered multimodal model, existing models can already do this.

See:

(2) “Critical harm” does not include any of the following:

(A) Harms caused or materially enabled by information that a covered model or covered model derivative outputs if the information is otherwise reasonably publicly accessible by an ordinary person from sources other than a covered model or covered model derivative.

comment by cfoster0 · 2024-09-27T03:59:20.155Z · LW(p) · GW(p)

I’m not sure if you intended the allusion to “the tendentious assumption in the other comment thread that courts are maximally adversarial processes bent on misreading legislation to achieve their perverted ends”, but if it was aimed at the thread I commented on… what? IMO it is fair game to call out as false the claim that

It only counts if the $500m comes from "cyber attacks on critical infrastructure" or "with limited human oversight, intervention, or supervision....results in death, great bodily injury, property damage, or property loss."

even if deepfake harms wouldn’t fall under this condition. Local validity matters.


I agree with you that deepfake harms are unlikely to be direct triggers for the bill’s provisions, for similar reasons as you mentioned.

Replies from: T3t
comment by RobertM (T3t) · 2024-09-27T04:45:05.699Z · LW(p) · GW(p)

Not your particular comment on it, no.

comment by ChristianKl · 2024-09-27T10:14:04.483Z · LW(p) · GW(p)

Child porn is frequently used to justify all sorts of highly invasive privacy interventions. ChatGPT* seems to think it would be a public safety threat under California law.

Existing models can do pictures but not video. A complex multimodal model might be able to do video porn.

Better models might produce deep fake audio with less data and closer to how the person actually speaks.

There's also the question of whether deep fake porn or faked audio is "accessible information" in the sense of paragraph (2)(A). That paragraph clearly absolves a model if you can read how to build a bomb in an existing textbook.

ChatGPT* does seem to think that pictures and audio would fall under "information", but it's less clear to me when it comes to the word "accessible".

* I think ChatGPT has a much better understanding of California law than I do; at the same time, it might also be wrong, and I'm happy to hear from someone with actual legal experience whether ChatGPT interprets the words incorrectly.

Replies from: T3t
comment by RobertM (T3t) · 2024-09-27T18:48:37.348Z · LW(p) · GW(p)

reasonably publicly accessible by an ordinary person from sources other than a covered model or covered model derivative

Seems like it'd pretty obviously cover information generated by non-covered models that are routinely used by many ordinary people (as open source image models currently are).

As a sidenote, I think the law is unfortunately one of those pretty cursed domains where it's hard to be very confident of anything as a layman without doing a lot of your own research, and you can't even look at experts speaking publicly on the subject since they're often performing advocacy, rather than making unbiased predictions about outcomes.  You could try to hire a lawyer for such advice, but it seems to be pretty hard to find lawyers who are comfortable giving their clients quantitative (probabilistic) and conditional estimates.  Maybe this is better once you're hiring for e.g. general counsel of a large org, or maybe large tech company CEOs have to deal with the same headaches that we do.  Often your best option is to just get a basic understanding of how relevant parts of the legal system work, and then do a lot of research into e.g. relevant case law, and then sanity-check your reasoning and conclusions with an actual lawyer specialized in that domain.

Replies from: ChristianKl
comment by ChristianKl · 2024-09-29T21:57:39.772Z · LW(p) · GW(p)

Deep fake porn of a particular person is not information that's generated by non-covered models that are routinely used by many ordinary people, even if the models could generate the porn if instructed to do so.

Replies from: T3t
comment by RobertM (T3t) · 2024-09-30T01:47:48.420Z · LW(p) · GW(p)

Almost no specific (interesting) output is information that's already been generated by any model, in the strictest sense.

Replies from: ChristianKl
comment by ChristianKl · 2024-09-30T08:24:37.661Z · LW(p) · GW(p)

If I tell a model to write me a book summary, that book summary can be specific interesting output without containing any new information. 

If I want to know how to build a bomb, there are already plenty of sources out there on how to build a bomb. The information is already accessible from those sources. When an LLM synthesizes the existing information in its training data to help someone build a bomb, it's not inventing new information.

Deep fakes aren't about simply repeating information that's already in the training data. 

So the argument would be that the lawmaker chose to say "accessible" because they wanted to allow LLMs to synthesize the existing information in their training data and repeat it back to the user, but that does not mean the lawmaker intended to allow LLMs to produce new information that gets used to cause harm, even if there are other ways to create that information.

answer by RamblinDash · 2024-09-26T11:02:45.180Z · LW(p) · GW(p)

It only counts if the $500m comes from "cyber attacks on critical infrastructure" or "with limited human oversight, intervention, or supervision....results in death, great bodily injury, property damage, or property loss."

So emotional damages, even if severe and pervasive, can't get you there.

comment by cfoster0 · 2024-09-26T15:04:13.455Z · LW(p) · GW(p)

If you read the definition of critical harms, you’ll see the $500m doesn’t have to come in one of those two forms. It can also be “Other grave harms to public safety and security that are of comparable severity”.

Replies from: RamblinDash
comment by RamblinDash · 2024-09-26T16:54:50.601Z · LW(p) · GW(p)

I have a hard time imagining a Court ruling that "Other grave harms to public safety and security that are of comparable severity" could embrace something so different in kind from the listed items.

Replies from: shankar-sivarajan
comment by Shankar Sivarajan (shankar-sivarajan) · 2024-09-26T17:56:13.128Z · LW(p) · GW(p)

From the Fish and Game code: 

"Fish" means a wild fish, mollusk, crustacean, invertebrate, amphibian, or part, spawn, or ovum of any of those animals.

This was read to include bees for the purposes of the Endangered Species Act, which lists

"bird, mammal, fish, amphibian, reptile, or plant" 

(Emphases mine).

Replies from: RamblinDash
comment by RamblinDash · 2024-09-26T18:33:01.724Z · LW(p) · GW(p)

Well, "fish" is a statutorily defined term that clearly includes all invertebrates. What did you want the court to do, ignore the statutory text? Arguably, that outcome supports the notion that the courts are less likely to just ignore text limiting the kinds of harm that are cognizable, not more likely, as you seem to be arguing.

Replies from: ChristianKl
comment by ChristianKl · 2024-09-26T19:30:05.247Z · LW(p) · GW(p)

(4) uses three terms: "public safety", "public security", and "comparable severity".

I would expect that severity means $500,000,000 worth of damage.

"Public safety" does not seem to be a clearly defined term, but it appears fairly broad.

comment by ChristianKl · 2024-09-26T18:46:31.956Z · LW(p) · GW(p)

If someone creates an automated system that makes deep fake porn, then emails that porn to blackmail people and publishes it when people don't pay up, that could very well be a system with limited human oversight, intervention, or supervision.

Those people who pay the blackmail would also suffer from property loss. 

If you have someone committing suicide because of deep fake porn images of themselves, it might also result in death.

If you have one suicide plus $500,000,000 worth of emotional damage, wouldn't it count?

7 comments


comment by Yair Halberstadt (yair-halberstadt) · 2024-09-26T15:30:39.210Z · LW(p) · GW(p)

I'm not sure why those shouldn't be included? If someone uses my AI to perform 500 million dollars of fraud, then I should probably have been more careful releasing the product.

Replies from: ChristianKl
comment by ChristianKl · 2024-09-26T16:06:05.302Z · LW(p) · GW(p)

leogao spoke of SB-1047 being quite mild.

If you include those things, I expect that you effectively outlaw Open Source AI models for video/audio generation.

I think it's a reasonable decision to outlaw Open Source AI models, but it doesn't sound "mild".

comment by Shankar Sivarajan (shankar-sivarajan) · 2024-09-26T15:37:43.916Z · LW(p) · GW(p)

[Comment removed to try to avoid getting rate-limited.] 

Replies from: Seth Herd
comment by Seth Herd · 2024-09-26T18:02:37.071Z · LW(p) · GW(p)

"Anyone who says otherwise is lying" is pretty judgmental and hostile. And wrong.

We really try to maintain a culture of assuming good intent on LW. The claim that nobody could legitimately misunderstand how the law would be enforced is quite a strong assumption about people's intelligence and the time they spend trying to understand things before commenting.

I think this phrasing is better suited to the broader internet where people start arguments instead of working together to understand the truth.

Replies from: shankar-sivarajan
comment by Shankar Sivarajan (shankar-sivarajan) · 2024-09-26T18:17:22.335Z · LW(p) · GW(p)

It is convention in fields outside of, say, math, for statements to be considered substantially true despite the possibility of exceedingly rare/implausible exceptions. And I accuse you of knowing this and making this isolated demand for my claim to be true without any conceivable exception (which I freely admit it isn't) because you don't like it.

working together to understand the truth.

Do you believe this to be a reasonable characterization of the discussion on LW about SB-1047?

Replies from: Seth Herd
comment by Seth Herd · 2024-09-26T18:28:41.401Z · LW(p) · GW(p)

I absolutely do think LW at large is trying to understand the truth about that bill, yes. I'm sure there are some exceptions, but I'd be surprised if there were many LWers willing to actively deceive people about its consequences - LWers typically really hate lying.

Your statement is not just technically incorrect, but mostly incorrect. Incompetence explains many more wrong statements than malice. I'm not going to change your mind on that right now, but it's something I think about an awful lot, and I think the evidence strongly supports it.

More importantly, your statement sounds mean, which is enough to not want it on LW. People being rude and not extending the benefit of the doubt leads to a community that argues instead of collaboratively seeking the truth. Arguing has huge but subtle downsides for reaching the truth - motivated reasoning from the combative relationships in arguments leads people to solidify their beliefs instead of changing them as the evidence suggests.

I believe the "about LW" page requests that we extend benefit of the doubt and be not just civil, but polite. If not, the community at large still does this, and it seems to work.

This has nothing to do with my support or lack thereof for SB-1047; I don't even know if I do support it, because I find such legislation's first-order effects to be pointless. My comment is merely about truth and how to obtain it. Being mean is not how you get at the truth.