Tech World in Turmoil: AI Lawsuit Bombshells & Cyber Attack Mayhem!
NY Times copyright suit wants OpenAI to delete all GPT instances. One of the sources used is a large collection of online material called "Common Crawl," which the suit alleges contains information from 16 million unique records from sites published by The Times. OpenAI no longer discloses as many details of the data used to train recent GPT versions. However, all indications are that full-text NY Times articles are still part of that process. (Much more on that in a moment.) "Defendants’ GenAI tools can generate output that recites Times content verbatim, closely summarizes it, and mimics its expressive style, as demonstrated by scores of examples," the suit alleges. "Publicly, Defendants insist that their conduct is protected as 'fair use' because their unlicensed use of copyrighted content to train GenAI models serves a new 'transformative' purpose," the suit notes. "A GPT model completely fabricated that “The Ne...