Introduction
The Delhi High Court is currently hearing India's first copyright dispute involving a generative AI platform. In ANI Media Pvt. Ltd. v. OpenAI Inc. & Anr. (CS(COMM) 1028/2024), instituted in November 2024, ANI alleges that OpenAI's ChatGPT unlawfully scraped and used its copyrighted news content, both freely accessible and paywalled, for training its large language model (LLM). This marks the first instance in India where copyright infringement claims have been directly tested against a GenAI system.
The case has attracted broad stakeholder intervention. The Federation of Indian Publishers (FIP) and the Digital News Publishers Association (DNPA) argue that unlicensed AI training undermines the credibility and economic sustainability of journalism. The Indian Music Industry has similarly contended that training AI models on copyrighted sound recordings without licences amounts to infringement and causes direct financial harm to rights holders.
Other intervenors have taken divergent positions. The Indian Governance and Policy Project (IGAP) has appeared as a neutral, policy-focused intervenor to assist the court on systemic implications. In contrast, Flux Labs AI, a startup, has warned that mandatory licensing for AI training data could stifle innovation and disproportionately disadvantage smaller developers.
Two amici curiae have been appointed. One has submitted that temporary storage of copyrighted material for training purposes may be lawful, and that the key inquiry is whether OpenAI used ANI's content beyond training. The other has taken the opposite view, arguing that OpenAI's activities violate copyright and that fair dealing cannot apply since OpenAI is neither a news agency nor engaged in criticism or review. On jurisdiction, both amici agree that the Delhi High Court has territorial jurisdiction, as ANI carries on business in Delhi.
In parallel, during a parliamentary session on February 7, 2025, the Ministry of Electronics and Information Technology clarified that web scraping of publicly available data for AI training is regulated under the Information Technology Act, 2000. Section 43 penalises unauthorised access, downloading, or extraction of data, potentially covering scraping without consent. The Ministry also noted that the Digital Personal Data Protection Act, 2023 obliges entities processing personal data - including through web scraping - to obtain informed consent and implement robust compliance measures.
Key Legal Issues
At the heart of the dispute is whether non-expressive or transitory reproductions created during AI training - such as tokenisation and vectorisation - constitute 'copies' under copyright law. The court must decide whether these technical processes amount to unauthorised reproduction of ANI's works and whether the subsequent use of trained models to generate user responses infringes copyright.
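For readers unfamiliar with the terms, the following toy Python sketch illustrates what tokenisation and vectorisation mean in the simplest possible form. It is an illustration only, under stated assumptions: the sample sentence and all function names are hypothetical, the text is split on whitespace rather than with the subword tokenisers that production systems typically use, and nothing here describes how OpenAI's systems actually process data.

```python
from collections import Counter


def tokenise(text):
    """Split text into lowercase word tokens (a simplification of real tokenisers)."""
    return text.lower().split()


def build_vocabulary(tokens):
    """Assign each distinct token an integer ID in order of first appearance."""
    vocab = {}
    for tok in tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab


def encode(tokens, vocab):
    """Replace each token with its integer ID, preserving word order."""
    return [vocab[tok] for tok in tokens]


def count_vector(token_ids, vocab_size):
    """Build a bag-of-words vector: how often each ID occurs, ignoring order."""
    counts = Counter(token_ids)
    return [counts.get(i, 0) for i in range(vocab_size)]


# Hypothetical sample sentence, not taken from any party's content.
sentence = "the court heard the matter"
tokens = tokenise(sentence)
vocab = build_vocabulary(tokens)
ids = encode(tokens, vocab)
vector = count_vector(ids, len(vocab))

print(tokens)  # ['the', 'court', 'heard', 'the', 'matter']
print(ids)     # [0, 1, 2, 0, 3]
print(vector)  # [2, 1, 1, 1]
```

In this toy, the ID sequence can be mapped back to the original words using the vocabulary, whereas the count vector discards word order and cannot. Which of these better describes what a model retains after training is a factual question that the sketch does not answer.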
Jurisdiction is another contested issue. OpenAI argues that since training occurred outside India, Indian courts lack jurisdiction to adjudicate the dispute.
On the merits, OpenAI submits that Section 52 of the Copyright Act, 1957, which sets out the fair dealing exceptions, need not be invoked where content is accessed from a legitimate source, such as a freely accessible website, because no infringement of the exclusive rights under Section 14 arises in the first place. In the alternative, it relies on fair dealing for 'private or personal use', including research. Whether large-scale AI training can fall within the contours of fair dealing under Indian law is therefore central to the case.
ANI's Submissions
ANI argues that OpenAI's collection, storage, and use of its works for training constitutes infringement. It submits that training necessarily involves storing copyrighted material in some form and that tokenisation and vectorisation amount to unauthorised reproduction. ANI emphasises that public availability does not place works in the public domain. It rejects OpenAI's fair dealing defence, characterising the use as commercial, purposive copying rather than private or personal use. ANI also points to OpenAI's licensing arrangements with other media houses and claims that it is losing subscribers as a result of the alleged infringement, asserting that the balance of convenience lies in its favour.
OpenAI's Defence
OpenAI counters that copyright protects expression, not facts, and that news content - being largely factual - attracts narrower protection in light of the public's right to information. It argues that ANI has failed to clearly identify the specific works allegedly infringed. OpenAI further submits that LLM outputs are independently generated expressions and do not amount to adaptation or reproduction of ANI's works. It maintains that non-expressive elements such as grammar, syntax, and common phrases are uncopyrightable, and that copying solely to extract such elements is non-infringing. On fair dealing, OpenAI relies on Indian precedent - Syndicate of the Press of the University of Cambridge v. BD Bhandari and Rameshwari Photocopy - recognising that limited reproduction may qualify as fair dealing even in commercial contexts.
Where Matters Stand
Without conceding liability, OpenAI blocked ANI's domain from future training as of October 2024. The litigation now turns on how Indian courts will interpret Section 52 in the context of AI training and whether they will align with or distinguish emerging global jurisprudence.
Beyond the courtroom, policy reform is underway. On April 28, 2025, the Department for Promotion of Industry and Internal Trade (DPIIT) constituted a committee to examine AI–copyright issues and recommend reforms. Its first working paper on Generative AI and Copyright, released on December 8, 2025, represents a significant policy intervention.
After reviewing approaches in the US, UK, EU, Japan, and Singapore, the committee concluded that no single model adequately serves India's objectives. Zero-priced licensing weakens creator incentives; opt-out regimes are impractical; voluntary licensing is unworkable at scale; and absolute right-holder control risks fragmentation and hold-outs.
The committee therefore proposed a hybrid statutory licensing framework, featuring:
- A mandatory blanket licence coupled with a statutory remuneration right for copyright owners;
- Permission for AI developers to train on lawfully accessed copyrighted works, subject to royalty sharing through a government-designated non-profit collective;
- Centralised administration of royalties through a single-window body - the Copyright Royalties Collective for AI Training (CRCAT) - with rates set by a government-appointed committee, and with even non-members eligible for payments upon registering their works in the 'Works Database';
- Remuneration triggered at the commercialisation stage rather than during training, to preserve innovation incentives;
- A light-touch transparency requirement for AI developers to disclose sufficiently detailed summaries of training datasets; and
- Retroactive royalty obligations for commercially successful AI systems already trained on copyrighted content.
Stakeholder comments have been invited, with a deadline of February 6, 2026. Together, the ANI litigation and the DPIIT's proposals are likely to shape the contours of AI development and copyright enforcement in India.
The content of this article is intended to provide a general guide to the subject matter. Specialist advice should be sought about your specific circumstances.