Bloomberg Law

AI’s Billion-Dollar Copyright Battle Starts With a Font Designer

Matthew Butterick was intrigued—but disturbed—when he first sat at his computer to play with an early version of an artificial intelligence programming tool called Copilot.

The tool let him start typing a line of code and then filled in the rest, like autocomplete for programmers. But Butterick, a prominent figure in the programming community, also recognized Copilot was “trained” on billions of lines of open-source code, including his own, without permission.
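
In practice, the interaction looks something like this (a simplified, hypothetical sketch, not an actual Copilot transcript): the programmer types a function's first line, and the tool proposes a plausible body.

    # Hypothetical illustration of the Copilot workflow (not a real transcript).
    def is_prime(n: int) -> bool:              # typed by the programmer
        # Everything below is the kind of completion the tool suggests.
        if n < 2:
            return False
        for i in range(2, int(n ** 0.5) + 1):
            if n % i == 0:
                return False
        return True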

Open-source programming, a pillar of the tech world, encourages developers to freely distribute their code and let others build on it. The catch: Those who reuse and modify code must credit the original programmer and abide by their licensing terms or restrictions.

Copilot appeared to run roughshod over that—no credit, no adherence to creators’ terms.

So Butterick wrote a blog post warning the tool posed a serious intellectual property problem. Its title: “This Copilot is stupid and wants to kill me.”

That post, in June 2022, ignited a wave of closely watched legal challenges to the booming AI industry—ones that could cost it billions of dollars and alter the very foundation on which AI is built.

Copilot’s makers, OpenAI Inc., Microsoft Corp., and GitHub Inc., are among the top AI companies now facing nearly a dozen lawsuits from authors, artists, and programmers. They claim the industry has vacuumed up their creative work—without consent or compensation—to train AI chatbots and image generators that are already beginning to replace them.

At the core of these novel cases sits Butterick, a typographer and lawyer hailed by some for leading the fight to hold AI accountable, and slammed by others as a Luddite and an obstacle to transformative technological advances.

The 53-year-old Californian built an eclectic career choosing less-traveled paths and making bets about the future. Peering around the corner, he sees a scenario where AI, without proper guardrails, swallows up the world’s creatives, including himself.

“My career as a programmer, as a designer, as a writer, I felt like it was over if this AI thing wasn’t addressed,” he said during an interview last month in San Francisco, across the street from Microsoft’s newest AI innovation lab. “I want there to be a habitat to return to at the end.”


Excited by Type

As a nerdy kid in suburban New Hampshire in the 1980s, Butterick had awful handwriting. So he cherished the gift he got on his 10th birthday: a manual typewriter.

“The idea that I actually had a device that could make nice, clean type was quite exciting for me,” he said.

At Harvard University in the early 1990s, Butterick hung out at the Bow & Arrow Press, a small studio in an artsy dorm called Adams House. Those years were formative. He learned to typeset and print using 15th-century technology, while mastering digital typesetting and graphic design on Macintosh computers.

“When he gets that bit between his teeth, he is just focused,” said Richard Nash, a friend and fellow Adams House resident. “You just knew that this was this kid’s obsession.”

Fonts that Butterick has designed.

After graduating in 1992, Butterick worked for a few years designing electronic typefaces. Then someone showed him an internet browser, and his world changed. He moved to San Francisco and founded Atomic Vision, a company that designed webpages for businesses.

Being a bright-eyed tech hopeful in northern California back then was akin to being a young, aspiring actor in Hollywood: No one had money, everyone worked out of their apartments, days started early and ended late.

Atomic Vision grew to more than 25 employees, with clients including Planned Parenthood and CNET. Butterick stayed on even after selling the company in 1999 to the open-source software firm Red Hat.

He was there when an existential battle erupted over the future of open-source programming. Microsoft was leading a campaign against open source, arguing that software based on copied and remixed code would result in a hodgepodge of intellectual property violations—a “cancer,” as former Microsoft CEO Steve Ballmer put it.


Open source won that war, leading to a vibrant and sustainable economy of hobbyists and companies, as long as everyone followed licenses. After a few years at Red Hat, Butterick decided to pivot from programming.

He enrolled at UCLA Law School. For his legal writing class, Butterick prepared his first assignment in Stempel Garamond, a popular publishing industry typeface. His professor directed him to submit the next paper in Courier, the bulky typewriter-era font.

“That’s just the rules, that’s how they do it in court,” he said the professor told him.

Butterick researched the guidelines in multiple courts and found that requirement was a myth. “It was just legal writing teachers being rigid,” he said.

He graduated in 2007, but the ensuing Great Recession proved a tough time to be a junior attorney. So Butterick leaned into a webpage he created, “Typography for Lawyers,” offering a detailed design guide for the presentation of legal briefs.

Butterick’s book is a must-read in the legal world.

It was an instant hit. The industry was in need of sleeker, more elegant-looking briefs. Butterick turned the online guide into a book still considered a bible for the profession. He also created a font called “Equity” that is widely used by courts, government agencies, and law firms.

“I’d be shocked if there is any legal writing professor in the country who is talking to their students about typography and doesn’t mention him, his website, or his book,” said Ruth Anne Robbins, a professor at Rutgers Law School.

His budding renown led him to San Francisco class-action lawyer Joseph Saveri.

Building a Case

A typography fan, Saveri had first reached out when Butterick’s website and book took off. He bought one of the fonts Butterick designed, and followed his blog in the years that followed.

Saveri was intrigued by Butterick’s warning about Copilot in the summer of 2022, and, in an email, asked if he thought a legal case might be viable.

That led to a series of phone calls. By that September, Butterick had reactivated his long-lapsed law license.

In November, the duo filed a federal copyright lawsuit against the makers of Copilot.

At the end of that month, OpenAI publicly released ChatGPT, and it instantly became a cultural and economic lightning rod, stirring predictions, celebrations, and warnings about how artificial intelligence would change the world.

Butterick and Saveri set their sights on other AI products presenting the same copyright problems as Copilot. In January they sued Stability AI Ltd. and Midjourney Inc., which sell text-to-image AI generators, on behalf of three visual artists who claimed the tools had pirated their works.

Nashville painter Kelly McKernan discovered AI-created images that looked eerily familiar. Photographer: Kevin Wurm/Bloomberg

One of the artists, Nashville-based watercolor painter Kelly McKernan, designs book covers for fantasy novels and does other commissioned art gigs.

McKernan, who uses they/them pronouns, said they’ve already seen AI art tools creep into the market as direct competitors. Then McKernan saw social media followers post AI images that looked eerily like their own.

Using the website “Have I Been Trained?”—a tool that claims to determine if artists’ works are being used to train AI—McKernan discovered that their portfolio had been ingested into the AI models.


If the lawsuit fails, “I might quit altogether,” McKernan said in an interview.

Butterick and Saveri brought a third group of cases this summer, representing authors claiming their books were used to train ChatGPT and Meta Platforms Inc.’s AI model, LLaMA.

To help publicize the cases, Butterick enlisted his high school friend from New Hampshire, Sarah Silverman, as a plaintiff. The comedian, author, and actor had been a strong advocate against movie studios using AI to replace humans, and her book “The Bedwetter” was being used to train AI chatbots. She also figured a case helmed by Butterick had a chance to succeed.

“He was the kid that’s, like, smarter than all the teachers—and the teachers know it,” Silverman later said on her podcast.

Many in the industry have hailed Butterick’s lawsuits.

“I think he saw this as one way that he could make a contribution that would ensure this technology doesn’t become the intellectual property of a few tech bros and billionaires,” said Russ Mitchell, a tech writer and former colleague at Red Hat.

In response to the lawsuits, the tech firms argue Butterick’s claims misrepresent the scope of copyright law, which they contend has built-in exceptions for innovations like AI. The companies have won some preliminary battles in the cases, but still unresolved is a key question: Does the law ban training an AI model on copyrighted work without permission?

“We’ll be raising that issue, and I think we will ultimately win,” said Mark Lemley, a Stanford law professor representing Meta and Stability AI.

Fair Use Defense

The Founding Fathers recognized the public should benefit from the work of writers and artists, so they gave Congress authority in the Constitution to create laws that incentivize the creation of novels, music, and art.

The current law dates to 1976 and gives creators a legal monopoly over their works by letting them decide who can copy and monetize them.

Copyright owners are often the first to express concerns about certain technological revolutions. A century apart, the music industry brought lawsuits over the piano roll, a device that could automatically play copyrighted piano pieces, and Napster, the once-dominant online song-sharing service.

Pamela Samuelson, a copyright law professor at the University of California, Berkeley, argues creators shouldn’t be so quick to demonize new technology, even if it looks like it could be disruptive.

“Maybe some of this ‘doom, doom, doom’ is not going to be the way it ends up,” Samuelson said.

New technologies also bring benefits, she noted. Streaming services and platforms have democratized the music industry; artists are already using AI tools to perfect their work.

Copyrights are also fundamentally limited. Giving an artist too much control over their work could ultimately hinder creativity.

The Copyright Act attempts to strike a balance through a legal doctrine known as “fair use,” which allows copying under certain circumstances.

What constitutes fair use is an especially fuzzy line to draw. Courts typically weigh whether the copying transformed the original work, whether it was done for a commercial purpose, how much of the work was taken, and whether the new use harms the market for the original. The success of a fair use defense ultimately depends on the facts of each case.

AI companies and many legal experts have compared the current copyright battle to a groundbreaking case against Google.

In the early 2000s, the company began manually scanning millions of books to create a digital, searchable database. Its search results provided only snippets of the books.

The Authors Guild, the country’s largest professional writers’ organization, sued for copyright infringement in 2005, arguing Google scanned and posted the snippets without permission. A federal appeals court ruled that Google’s copying was fair use because the company’s goal of creating a database was “highly transformative.”

The copying was also for the purpose of providing factual information about the books. Facts, unlike artistic expression, are not protected by copyright.


The Google Books case has been cited by AI companies defending their use of copyrighted works to train their systems. They argue AI models, like humans, “learn” about the images and texts, examining the statistical relationships between words and pixels so the model can predict the best possible response to a user prompt.
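
Stripped to its simplest form, that statistical idea can be shown with a toy model that tallies which word follows which in its training text and then predicts the most likely continuation. The sketch below is illustrative only; production models are vastly larger and are not built this way line for line.

    # A toy "language model": count word pairs in training text, then predict
    # the most likely next word. Illustrative only; not any company's system.
    from collections import Counter, defaultdict

    training_text = "the cat sat on the mat and the cat ate"

    counts = defaultdict(Counter)
    words = training_text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1        # tally each observed pair

    def predict(word: str) -> str:
        """Return the word seen most often after `word` in the training text."""
        return counts[word].most_common(1)[0][0]

    print(predict("the"))  # -> "cat", the most frequent continuation

The legal fight turns on the step this sketch glosses over: building those statistics requires first copying the training text.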

Butterick’s lawsuits don’t see it that way. Machine learners, unlike human ones, need to copy millions of books, he contends. He also believes the AI companies lean on shifting explanations to describe the models.

“Sometimes, it’s ‘Oh, it learns like a human,’ and other times when the model is doing something nasty, it’s ‘Oh, it’s a black box,’” Butterick said.

And unlike the Google Books case, AI companies are creating machines that could directly compete with the artists and writers whose work trains the machines, he said.

Butterick and Saveri’s cases are still in early stages and haven’t yet broached the fair use question. The legal industry and AI companies wait with bated breath.

“The views on this issue are as far apart as anything could possibly be,” Samuelson said.

‘Big Kahuna’ Claim

From the moment Butterick and Saveri began exploring a lawsuit over the Copilot tool, they’ve endured criticism from technologists, legal experts, and internet trolls. The two even reported getting death threats after announcing they were investigating Copilot for a potential legal case; it’s the reason the coders they represent in the initial lawsuit weren’t publicly identified.

The most important opinion, however, will come from the bench in a San Francisco courtroom. Butterick found himself there last month, dressed in a suit and black Converse high tops, as his partner faced questions from a judge clearly skeptical of some of their theories.

One theory is that copyright infringement happens at the output side of the AI models: because Meta’s AI model grows smarter by ingesting the entirety of Silverman’s book, for instance, every answer it spits out, and the model itself, amounts to an infringement of her work.

“My head,” Judge Vince Chhabria told Saveri, “explodes trying to understand that.”

Two weeks later, Chhabria rejected the portions of the complaint built on the output theory but said the duo could refile with better allegations and explanations.

Another judge reached a similar conclusion in the visual artists’ lawsuit against Stability AI and Midjourney, but also allowed them a second shot.

In that case, Butterick and Saveri filed an amended complaint adding evidence they claim shows AI outputs that look like the artists’ work.

The pair also contend that infringement occurs at the input side, when AI companies scrape huge swaths of text and images to train their models. Samuelson, the Berkeley professor, calls that theory the “big kahuna” claim.

For procedural reasons, the AI companies haven’t yet contested the input theory, but they will soon raise the fair use defense in court.

Still, the lawsuits don’t have universal support in the creative community. Bradley Kuhn, director of the open-source advocacy group Software Freedom Conservancy, similarly believes that Microsoft’s Copilot completely disregards open-source licensing.

But talks with Butterick and Saveri to make the group one of the named plaintiffs in the Copilot suit collapsed, Kuhn said, because the lawyers were focused more on getting money for the programmers than changing the system.

“They don’t focus on what the goal of open-source licensing is, which is to keep software freely available for everyone,” Kuhn said.

Butterick declined to address Kuhn’s specific assertion, but said he doesn’t mind most of the criticism. He might make some mistakes, he acknowledged. But the way he sees it, the AI companies have been opaque, and as an attorney his job is to poke and prod his way to the truth.

Butterick also reiterates frequently that he and his clients aren’t opposed to AI as a technology. He’s even used AI techniques to program his own digital page layouts. But he believes the future of the creative economy rests on establishing a stable relationship between human creators, AI companies, and their customers.

If the companies continue without compensating artists, the triangle breaks down: The human artists will be out of work, the machines will have less content to train on, and the finished product will suffer.

Copyright law might not be the perfect vehicle to fix that triangle. But for now, it’s the sharpest tool they’ve got.

“I don’t take the view that I can’t do anything unless the law is changed,” he said. “What’s important is that we get into court now and we take the legal tools we’ve got and apply them.”


To contact the reporter on this story: Isaiah Poritz in Washington at iporitz@bloombergindustry.com

To contact the editors responsible for this story: John P. Martin at jmartin@bloombergindustry.com; James Arkin at jarkin@bloombergindustry.com