Thoughts on the inevitable attempts by organizations to control the use of LLMs by their members

This is a quick run-down of the major concerns I’ve heard voiced by organizations — mostly corporations and schools — about Large Language Models (LLMs) like ChatGPT, and to a lesser extent about generative neural networks (gNNs) like Midjourney. It is admittedly cynical in the sense that, taken together, it sounds somewhat defeatist. But it is also realistic: there is too much money involved for the leading companies in the industry not to ram these technologies down the throats of technology consumers, and even if the push fails completely to catch fire, that cycle will last about five years before industry titans like Microsoft and Google admit failure.

On the concerns about disclosure of “secrets” or the use of copyrighted material in training sets:

“Secrets” don’t exist once you have digitized them — the act of digitization is the act of publishing and broadcasting, both tautologically, because that is the nature of the digital medium, and practically, because we have repeatedly demonstrated that we can’t stop hackers, whistle-blowers, pirates, and leakers from redistributing digital content. And the law-maker class hasn’t yet shown itself capable of writing a reasonable set of rules around these problems, so we shouldn’t believe legislatures or courts can improve dramatically enough to craft something that fixes this.

On the belief that we can cordon off internal systems from the Internet at large to make them “AI-free zones”:

Enclosures around private versus public also don’t exist as more than conceptual frameworks we use to feel good, in a nostalgic sort of way, about being able to “defend our castle”. Hooking a device to a network that is itself connected to the Internet, and then trying to control its access to the Internet (or the Internet’s access to it), creates a cost center that will never be profitable and whose costs only increase over time.

On the creation of policy that coerces members of the organization to only use LLMs or gNNs in “approved” ways:

Today, there is really no honest or realistic business justification to govern the use of LLMs or gNNs except for the question of their accuracy. But, in their current state, it might reasonably be argued that their accuracy is no worse than that of an average human, and that they are more likely to improve in the near term than the average of humanity is. (In the future, such governance may expand to limit liability, depending on how liability is assigned by statute or precedent — precedent which does not yet exist but might be viewed as inevitable.) The far tougher questions around governing the use of LLMs and gNNs in the social and ethical domains aren’t something business has historically cared about, so there is little basis to argue that it should suddenly be responsible for them now. Nothing about ML in general, or LLMs and gNNs in particular, is more impactful than printing, electricity, telecommunications, or computation, all of which were adopted by industry without regard to their social or ethical impacts.

On the idea that organizations are empowered to deliberate these questions at all:

In general, we don’t have a choice — the tools we use to operate our businesses will be injected with these technologies whether we want them or not. Within five years, operating systems, office suites, Web browsers, workflow engines, development environments, and all manner of off-the-shelf technologies that we use as the infrastructure of our corporate and technical operations will only function as derivatives of these types of machine-learning systems, and we can’t afford to re-build that infrastructure in its pre-ML form.

On the proliferation of these tools:

I think we should look at them like encryption — no one wanted it when it was a hassle, but now that it is incorporated into the table stakes of how humans experience the Internet, the Web, and the walled-garden ecosystems, no attempt by governments or corporations to weaken or roll back the widespread adoption of encryption, for data both at rest and in motion, is expected to succeed. LLMs and gNNs, like encryption, are the kind of technology that, once people have it, they refuse to give back.

