Training AI models requires ๐บ๐ฎ๐๐๐ถ๐๐ฒ ๐ฎ๐บ๐ผ๐๐ป๐๐ ๐ผ๐ณ ๐ฑ๐ฎ๐๐ฎ, often scraped from the web. But is all this online content truly ๐ณ๐ฟ๐ฒ๐ฒ ๐๐ผ ๐๐๐ฒ? Microsoft Thinks So!
๐๐
๐ถ๐๐๐ถ๐ป๐ด ๐๐ฎ๐๐?
Copyright law protects original works of authorship, including content available on the internet. The "Fair Use" Doctrine allows limited use of copyrighted material without permission for purposes such as criticism, comment, news reporting, teaching, scholarship, or research. However, the determination of fair use depends on several factors, including the purpose and character of the use, the nature of the copyrighted work, the amount used, and the effect on the market value. U.S. Copyright Office Fair Use Index- https://lnkd.in/g_Qhyjqm
๐ช๐ต๐ฎ๐ ๐ ๐ถ๐ฐ๐ฟ๐ผ๐๐ผ๐ณ๐ ๐ง๐ต๐ถ๐ป๐ธ๐?
Mustafa Suleyman, Microsoft AI head, suggests that content available on the open web is fair game for training AI models unless explicitly restricted. He believes the norm since the 1990s has been that online content is essentially "freeware" for copying and reproduction. According to Suleyman, unless a publisher or news organization explicitly states "do not scrape or crawl," AI companies can use their content for training models.
๐๐ฎ๐๐๐๐ถ๐๐ ๐๐ด๐ฎ๐ถ๐ป๐๐ ๐ข๐ฝ๐ฒ๐ป๐๐ ๐ฎ๐ป๐ฑ ๐ ๐ถ๐ฐ๐ฟ๐ผ๐๐ผ๐ณ๐?
Despite Suleyman's stance, several legal challenges have emerged. In May, eight American newspaper publishers, including major newspapers like the New York Daily News and the Chicago Tribune, sued Microsoft and OpenAI for unauthorized use of their articles in AI products. Additionally, the Center for Investigative Reporting (CIR) filed a lawsuit against these companies for using their copyrighted material without permission or compensation. Few examples-
https://lnkd.in/ggmRaRi2
https://lnkd.in/g7_ATivW
https://lnkd.in/gEpDjAft
๐ข๐ฝ๐ฒ๐ป๐๐'๐ ๐ฅ๐ฒ๐ฐ๐ฒ๐ป๐ ๐ฃ๐ฎ๐ฟ๐๐ป๐ฒ๐ฟ๐๐ต๐ถ๐ฝ๐?
Amid these controversies, OpenAI hasย strategically completed partnerships with Financial Times, Stack Overflow, Reddit, News Corp, Vox Media, The Atlantic, Apple,Time
Our ๐ง๐ฎ๐ธ๐ฒ?
Just because something is online doesn't mean it's freely for all uses to do anything with it. Existing lawsuits against OpenAI claim it used copyrighted material without permission, removed copyright info, reproduced and distributed works, and created unauthorized derivative works which can harm market value of the original work. This can not be ignored. While Microsoft's view on content use is surprisingly too simplistic given the sensitivity of the issue, I highly admire OpenAI's partnerships with major content providers showing a move towards more responsible and cooperative content use.
Discussion about this post
No posts