NYLJ “AI is Everywhere, But Where Can You Sue It? Court Examines Jurisdiction In the ‘Age of the Internet and AI’”

Sep 29, 2025

By Steve Kramarsky

This summer, the big news on the Artificial Intelligence (AI) litigation front has been coming primarily out of California.

Two widely reported decisions from the Northern District of California addressed the unauthorized use of copyrighted material for the training of large language models (LLMs), and recently the news has focused on the defendant in one of those cases (Bartz v. Anthropic PBC, No. C 24-05417 WHA, 2025 WL 1741691 (N.D. Cal. June 23, 2025)) seeking to settle with authors whose works it acquired illegally.

It is important to note, in that context, that the court in Anthropic held that the use of copyrighted works to train AI models is fair use, requiring no payment or prior authorization, provided that the works are acquired legally.

Anthropic is offering a reported $1.5 billion settlement not because it trained its models on copyrighted material (which is permissible), but because it “pirated over seven million copies of books” instead of paying for them.

Data acquisition for training is costly, but if the training itself is legally protected as fair use, there is no reason to believe those costs will be ruinous to the underlying business model.

More challenging legal and economic questions arise, however, when courts look beyond the training phase. The California cases focus on model training and the works used in that process, not the results the AI models generate.

In Anthropic, for example, the “authors challenge[d] only the inputs, not the outputs, of these LLMs.” But the cases pending in New York contain more complex claims, including copyright and trademark infringement claims arising from the outputs of LLMs and the methods used to shape or update those outputs after training.

One such case is the pending suit by Dow Jones and NYP Holdings (owners of the Wall Street Journal and New York Post, among others) against Perplexity AI over its “answer engine”—a product that allegedly allows users to “skip the links” to the publishers’ websites and get answers directly from the AI based on articles in those publications. Dow Jones & Co., Inc. v. Perplexity AI, Inc., No. 24 CIV. 7984 (KPF), 2025 WL 2416401 (S.D.N.Y. Aug. 21, 2025).

That case “stands at the crossroads of artificial intelligence and intellectual property” and the court’s recent opinion on jurisdiction offers insight into current AI business and technological models and the legal issues that will take center stage as these matters progress.

Background: An Overview of the Technologies at Issue

Modern generative AI products, so-called “transformer-based” models like ChatGPT, Claude, and Perplexity, work by taking a large set of data (such as a library of published books and articles), analyzing the relationships among the elements of that set (such as words or portions of words, referred to as tokens), and generating a complex map of those relationships.

This map consists of billions of numbers (parameters) that the model uses to transform a user request into a response. Creating these maps requires enormous amounts of input data and computational power and multiple phases of analysis and fine-tuning, but once the process is complete the parameters do not change.

The “P” in ChatGPT stands for “pre-trained,” reflecting the fact that, once training is complete, a model’s underlying relationship map is fixed.
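To make the idea of a fixed "relationship map" concrete, the following is a deliberately tiny sketch and not how any commercial model is built. It assumes a toy corpus and a simple word-pair count table in place of the billions of learned parameters in a real transformer; the names train, generate, CORPUS, and PARAMS are hypothetical and chosen only for illustration. It shows the two phases described above: the map is built once from the input data, and it is then only read, never modified, when a prompt is turned into a response.

```python
from collections import Counter, defaultdict

def train(corpus: str) -> dict[str, Counter]:
    """Build the 'map': count how often each word follows another.

    Assumption: these counts stand in for a real model's parameters.
    """
    tokens = corpus.lower().split()
    table: dict[str, Counter] = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        table[current][following] += 1
    return dict(table)

def generate(table: dict[str, Counter], prompt: str, max_new_tokens: int = 5) -> str:
    """Extend the prompt one word at a time using the frozen table."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_new_tokens):
        candidates = table.get(words[-1])
        if not candidates:
            break  # the map has nothing to say about this word
        # Pick the most frequent follower; break ties alphabetically
        # so that the output is deterministic.
        next_word = min(candidates, key=lambda w: (-candidates[w], w))
        words.append(next_word)
    return " ".join(words)

# Hypothetical training text, invented for this example.
CORPUS = (
    "the court held that the use was fair use . "
    "the court held that venue was proper ."
)
PARAMS = train(CORPUS)  # "training" happens once

print(generate(PARAMS, "the court", max_new_tokens=3))
```

Note that the toy model produces text that sounds plausible given its training data without any notion of whether the resulting statement is true, which is the limitation the techniques discussed below try to address.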

In the real world, outputs of these models often turn out to be unsatisfactory or imperfect, so various techniques have been developed to improve performance without expensive re-training, but the law on those real-world use cases is still undeveloped. In Anthropic, the California court addressed the pre-release training process; but what about whatever happens next?

The Dow Jones case, and others like it, focus on the outputs of AI models and the systems used to fine-tune them after training. The process of generating output from a user’s prompt to an LLM is called “inference,” and it presents a host of technical and legal issues that are just beginning to enter mainstream discussion.

Perhaps the most familiar of these, and the one at the heart of Dow Jones, is the issue of poor result quality, including inaccurate results or “hallucinations.” Anyone following AI in the legal profession has read about lawyers and pro se litigants getting into trouble for using general purpose AI chatbots like ChatGPT to write legal briefs.

Those models are not purpose-built for legal work and do not include systems to “fine tune” their output for legal drafting, or guardrails to prevent inaccuracy; they merely predict plausible-sounding strings of text in response to the user’s request. A brief generated by those tools will often misstate the law or make up citations to non-existent cases, causing serious issues for the litigants who rely on them.

To address these kinds of shortcomings, many companies implement a process called Retrieval Augmented Generation (RAG). In RAG systems, the user’s query is first run against a trusted source (such as an internal document store or authoritative online database) and the results are then fed to the LLM to summarize and “repackage” into a natural-language answer.

This process greatly reduces the chance of inaccurate results, lowers inference costs, and can provide a means to link back to the source material for easier human checking.
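The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a description of any vendor's system: the three-document INDEX, the example.com sources, the keyword-overlap scoring, and the function names are all hypothetical, and the final call to a language model is omitted because it depends on the provider. Production systems typically use far more sophisticated search over much larger indexes.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Document:
    source: str  # where the text came from, kept so the answer can cite it
    text: str

# Hypothetical "trusted source" index, invented for this example.
INDEX = [
    Document("example.com/markets",
             "Stocks rose on Monday after the central bank held rates steady."),
    Document("example.com/sports",
             "The home team won the championship game in overtime."),
    Document("example.com/weather",
             "Heavy rain is expected across the region on Tuesday."),
]

def tokenize(text: str) -> set[str]:
    """Lowercase the text and strip basic punctuation from each word."""
    return {word.strip(".,?!").lower() for word in text.split()}

def retrieve(query: str, index: list[Document], top_k: int = 1) -> list[Document]:
    """Step 1: run the query against the index (simple keyword overlap)."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc.text)), doc) for doc in index]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

def build_prompt(query: str, docs: list[Document]) -> str:
    """Step 2: package the retrieved text, with sources, for the LLM."""
    lines = ["Answer the question using only the sources below. Cite them."]
    for number, doc in enumerate(docs, start=1):
        lines.append(f"[{number}] ({doc.source}) {doc.text}")
    lines.append(f"Question: {query}")
    return "\n".join(lines)

question = "Did stocks rise after the rates decision?"
hits = retrieve(question, INDEX)
print(build_prompt(question, hits))
```

The sketch also shows why this architecture raises the questions at issue here: the retrieved source text is copied into the prompt, and the source label travels with it into the answer.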

Anyone who uses Google search has seen this in action: in addition to the traditional page of search result links, Google now provides an “AI Overview” at the top of the page which provides an “answer” to the user’s question with text and images, including links back to source material.

This saves the user the trouble of visiting the source pages, but legal issues can arise: if the websites generate their income from clicks, the RAG summary deprives them of that revenue, and if the AI “answer” is not accurate, the cited websites may suffer reputational harm from the attribution. These are the issues in the Dow Jones case.

RAG and the Claims in Dow Jones

In Dow Jones, plaintiffs are New York-based publishers incorporated in Delaware whose publications include, among others, The Wall Street Journal and the New York Post. Their revenues from original content come “predominantly from selling subscriptions to their digital publications and from online advertising” presented when consumers visit their websites. 2025 WL 2416401, at *1.

Defendant, a Delaware corporation based in San Francisco, is an AI company that developed an “answer engine” called Perplexity. Perplexity allegedly generates answers to user questions based on information from “authoritative sources.”

It generates output using a RAG system, which queries a database of content from “original sources” that Perplexity selects. The information from the RAG database is provided to an LLM, which repackages the original, indexed content in its answers to users, just like the Google “AI Overview.” The materials selected by Perplexity for the RAG database allegedly include various intellectual property of plaintiffs, including copyrighted articles that Perplexity deems “trustworthy.”

Perplexity specifically advertises that its product allows users to “Skip the Links” to the publishers’ websites and instead access copyrighted content through queries. While this makes Perplexity’s product more accurate, plaintiffs argue that it diverts customers and critical revenue from their websites, and that the “primary and express purpose” of the products Perplexity sells in New York is to encourage and allow New York subscribers to consume their content without paying them. (Notably, after this opinion came down, Perplexity dropped the “Skip the Links” tagline.)

Broadly speaking, plaintiffs assert that Perplexity violates their intellectual property rights in three ways: first, by copying their copyrighted works as inputs to the RAG index; second, by providing users with outputs that contain “full or partial verbatim reproductions” of their articles; and third, by generating “made-up text (hallucinations) in its outputs” and attributing that text to their publications using their trademarks.

Personal Jurisdiction in the ‘Age of the Internet and AI’

The substantive issues raised in Dow Jones are fundamental to the future of RAG and will shape the legal landscape of generative AI, but the threshold question is: who will decide them? Perplexity, a San Francisco-based company, argues that the Southern District of New York lacks personal jurisdiction over it and these issues should be litigated on its home turf in California.

This argument may seem frivolous given Perplexity’s worldwide reach and its New York offices, employees, and Times Square advertising billboards, but “the court is mindful of the complexities posed by personal jurisdiction in the age of the internet and artificial intelligence” and provides a detailed analysis of the jurisdictional and venue arguments, focused on the technological context.

The federal copyright and trademark laws do not provide for nationwide service of process, so the court looks to the state’s jurisdictional rules to resolve issues of personal jurisdiction. Plaintiffs do not allege general jurisdiction over Perplexity (as it is not incorporated or headquartered here); instead, they rely on New York’s long-arm statute, which permits the exercise of specific jurisdiction over foreign corporations under certain circumstances.

The relevant sections of the long-arm statute here are CPLR §302(a)(1) and CPLR §302(a)(3), and the court finds jurisdiction under both. Its analysis of jurisdictional and venue issues is lengthy, but a few points are worth noting.

Under CPLR §302(a)(1), jurisdiction exists where an entity transacts business in New York and the action “arises out of” that business. For the first part of the analysis, the court noted that Perplexity does business in New York in a manner “similar to that of a traditional business.”

It is registered to do business in New York, rents office space and has employees here, and targets advertising to New York customers. Leaving that aside, Perplexity has a “highly interactive” website that allows customers to buy and use its services directly in New York.

The Second Circuit has held that merely having an informational website accessible in New York is not sufficient to establish jurisdiction here (since the internet is essentially available everywhere), but a fully interactive site that permits New York users to transact with the company can constitute transacting business in New York for purposes of §302(a)(1) if an action arises from those transactions.

Perplexity’s website is not only interactive, it targets New York users, for example with advertising aimed at using the product to “discover” New York landmarks, restaurants and attractions. Given Perplexity’s physical and internet presence, the court here holds that it transacts business in New York.

Perplexity argues, however, that it should not be subject to jurisdiction because plaintiffs’ claims do not arise from its New York business. It attempts to “silo” its New York presence, noting that its engineers, web-crawlers, and computer infrastructure are located primarily in California. It argues that the part of its business responsible for any tortious conduct is not located in New York, so any harm cannot “arise out of” its New York business activity.

The court rejects this overly formal analysis, noting that the claims relate to Perplexity’s website: sales are made into New York through the website, and any infringing results are presented there. Since the entire company supports the website, in both California and New York, the claims sufficiently arise out of the New York business presence to support jurisdiction under §302(a)(1).

Having found jurisdiction under §302(a)(1), the court also looks to §302(a)(3) as a separate basis for its ruling. Under that section, specific personal jurisdiction is available over a defendant who commits a tort outside of New York that causes harm within the state, if the defendant could reasonably have expected the act to have consequences in New York and derives substantial revenue from interstate or international commerce. This test has several factors, all of which the court addresses, but the most interesting is the place of the injury.

The alleged infringements of plaintiffs’ trademarks and copyrights are qualifying torts, and those torts “occur” in California and Virginia, where Perplexity’s website is created and hosted. The alleged tortious conduct thus occurs outside of New York; but where is the harm felt? In internet trademark cases, the in-state injury requirement “is satisfied by harm and threatened harm resulting from actual or potential confusion and deception of internet users” in New York.

Here, plaintiffs allege that internet users in New York use the Perplexity answer engine and are confused, believing that generated content, which is sometimes erroneous, comes from the Wall Street Journal or the New York Post. This kind of customer confusion as to source is sufficient to form the basis for jurisdiction in New York.

In copyright cases, harm in New York can include loss of business or customers in New York, which plaintiffs allege here. But not all online misappropriation of intellectual property passes this test, and the analysis can become fairly involved.

Here, plaintiffs allege that Perplexity’s unauthorized use of their materials caused the loss of advertising, licensing, and sales revenues from New York customers. The court analogizes these claims to “digital piracy” cases, in which the Second Circuit has held that a New York copyright holder whose works are uploaded to the internet for widespread sharing suffers injury in New York.

Recognizing that these cases have traditionally been limited to claims of internet filesharing, the court examines plaintiffs’ claims and alleged injuries and determines that, despite the different context, the piracy caselaw is a good jurisdictional fit. It therefore finds “the situs of injury to be New York, where plaintiffs hold the copyright to their works.”

The court then briefly addresses the remaining elements of Section 302(a)(3), holding that Perplexity could reasonably foresee being sued in New York (given its advertising and business efforts here) and that it derives substantial revenue from international commerce. It therefore finds personal jurisdiction proper under CPLR §302(a)(3).

Having established personal jurisdiction under New York’s long-arm statute, the court also undertakes the required federal constitutional due process analysis. This analysis requires that a defendant have minimum contacts with the forum state and that trial of the matter in the selected forum “comport with fair play and substantial justice.”

Perplexity claims that it is “an early-stage startup with limited resources for cross-country litigation,” but the court notes plaintiffs’ allegation that defendant “recently raised $500 million and is valued at $9 billion,” making its assertion ring somewhat false. It therefore holds that trial in New York does not offend due process.

Finally, the court addresses and denies Perplexity’s motion to dismiss under Rule 12(b)(3) for improper venue. The standard on this motion includes numerous factors, most of which the court finds irrelevant or neutral where, as here, both litigants are large companies with nationwide reach.

The court notes that both the Southern District of New York and the Northern District of California are familiar with the governing federal intellectual property laws, and that alleged “docket congestion” is not a reason to favor one court over another.

It further holds that, in the age of electronic discovery, the physical location of evidence is no longer relevant, and that Perplexity’s forum selection clause is not applicable to plaintiffs, as they were not customers. It therefore holds that plaintiffs’ choice of forum is appropriate, and the case can go forward in the Southern District of New York.

Why It Matters: The Economics of AI and Search

The Dow Jones opinion is long and thorough and offers excellent analysis of the issues around personal jurisdiction on the internet, but one might ask why it exists. To an outsider, many of the arguments presented seem barely colorable. Why is Perplexity fighting so hard to get out of New York and back to its home court in California?

This case, and others like it (including a very similar one filed in September by Merriam-Webster and Encyclopaedia Britannica), challenge the fundamental economics of AI search. One of the most widespread use cases for generative AI is as a replacement for Google: rather than Google a question, many people now “ask ChatGPT.”

The result is an answer generated by an LLM using RAG (or a similar technology) on results pulled from the open web. The upshot is that users can get answers (albeit of varying quality) without ever “clicking-through” to the pages that own and publish the content.

That supports the business of the AI companies—in fact they argue they could not exist without it—but the web publishers that rely on clicks for advertising or subscription revenue are deprived of that revenue and lose the incentive to create or sponsor more content.

Without some rebalancing of those economics, the success of the “answer engines” will likely lead to the death of the open web that built them.

Given that these could be existential issues for both sides, it’s not surprising that the litigants are fighting hard for any advantage, and the AI companies apparently believe that the courts serving Silicon Valley will be more tech-friendly, and less protective of traditional media, than those in New York. With Dow Jones going forward in the SDNY, we will soon find out if they are right.

This article first appeared in the New York Law Journal on September 29, 2025.