Discussion: LLaMA Leak & Whistleblowing in pre-AGI era

post by jirahim · 2023-03-06T18:47:06.459Z · LW · GW · 4 comments

It was reported that Meta's LLaMA models were leaked; someone even opened a PR adding the magnet link to their official repository.

The public now has access to a model that is apparently as powerful as, or even more powerful than, GPT-3 on most benchmarks.

Is this a good or a bad event for humanity?

Are powerful models better kept behind closed doors, used only by the corporations that produced them, or does public access level the playing field, despite the potential for misuse by bad actors?

Should this continue to happen? Should the AI field have its own Snowdens, blowing the whistle when they notice something that is in the public interest to know?

What if they work at Large Corporation X and believe the first AGI has been invented? Is it better for humanity that the AGI is used solely by that CEO (or the board of directors, or the ultra-rich who can pay billions of dollars to that AI company for exclusive use) for the next five years, amassing as much power as possible until they monopolize not just the industry but potentially the whole world? Or is leaking the AGI weights to the public the lesser of two evils, and in fact a moral responsibility, so that all of humanity gains AGI capabilities instead of one person or a small group?

Let's discuss.

4 comments


comment by baturinsky · 2023-03-06T19:13:07.781Z · LW(p) · GW(p)

LLaMA was leaked "by design". The way it was distributed, it would be impossible for it not to leak.

And yes, if a really powerful AI were to leak, someone would make a Skynet with it within an hour just to see if they can. So I would prefer that someone, anyone, control it alone.

Replies from: Taleuntum
comment by Taleuntum · 2023-03-06T19:57:40.947Z · LW(p) · GW(p)

Agreed. I got the weights very quickly after filling out the form, even though I simply wrote "None" in the (required!) "Previous related publications" field. (It still felt great to get it, so thx Meta!)

Replies from: jirahim
comment by jirahim · 2023-03-07T06:58:43.145Z · LW(p) · GW(p)

Okay, so you believe this was a marketing stunt?

OPT-175B still has not leaked, despite also being "accessible-by-request" only.

comment by jchan · 2023-03-07T16:27:07.200Z · LW(p) · GW(p)

I'm unsure whether it's a good thing that LLaMA exists in the first place, but given that it does, it's probably better that it leak than that it remain private.

What are the possible bad consequences of inventing LLaMA-level LLMs? I can think of three. However, #1 and #2 are of a peculiar kind [LW · GW] where the downsides are actually mitigated rather than worsened by greater proliferation. I don't think #3 is a big concern at the moment, but this may change as LLM capabilities improve (and please correct me if I'm wrong in my impression of current capabilities).

  1. Economic disruption: LLMs may lead to unemployment because it's cheaper to use one than to hire a human to do the same work. However, given that they already exist, it's only a question of whether the economic gains accrue to a few large corporations or to a wider mass of people. If you think economic inequality is bad (whether per se or due to its consequences), then you'll think the LLaMA leak is a good thing.
  2. Informational chaos: You can never know whether product reviews, political opinions, etc. are actually genuine expressions of what some human being thinks rather than AI-generated fluff created by actors with an interest in deceiving you. This was already a problem (e.g., paid shills), but with LLMs it's much easier to generate disinformation at scale. However, this problem "solves itself" once LLMs are so easily accessible that everyone knows not to trust anything they read anyway. (By contrast, if LLMs are kept private, AI-generated content seems more trustworthy because it comes in a wider context where most content is still human-authored.)
  3. Infohazard production: If e.g. there's some way of building a devastating bioweapon using household materials, then it'd be really bad if LLaMA made this knowledge more accessible, or could discover it anew. However, I haven't seen any evidence that LLaMA is capable of discovering new scientific knowledge that's not in the training set, or that querying it to surface existing such knowledge is any more effective than using a regular search engine. But this may change with more advanced models.