This development has long raised questions about the legitimacy of technological self-help measures and boilerplate contractual terms that conflict with public policy. With the rapid growth of generative AI, these concerns have become urgent. Data is essential for training, updating, and overseeing AI systems, yet platforms increasingly use technical barriers and contracts to restrict access to publicly available data. Disputes such as the hiQ Labs v. LinkedIn litigation over scraping and Cloudflare’s recent accusations against Perplexity AI highlight a broader struggle over who controls the data needed for AI development and oversight. While content providers have legitimate interests in protecting ownership, privacy, safety, and cybersecurity, those who bypass barriers to access public data may also have valid public-interest reasons. The balance between these interests should be set by public policy, not unilateral private power.
The talk examines the legal challenges raised by these conflicts and argues that access to data in the AI era is too important to be left to private ordering. The law must limit excessive data privatization to ensure equitable, reliable access that supports research, accountability, and innovation.
Further information and registration here