AI-generated data can poison future AI models
Thanks to a boom in generative artificial intelligence, programs that can produce text, computer code, images and music are readily available to the average person. And we’re already using them: AI content is taking over the Internet, and text generated by “large language models” is filling hundreds of websites, including CNET and Gizmodo. But as AI developers scrape the Internet, AI-generated content may soon enter the data sets used to train new models to respond like humans. Some experts say that will inadvertently introduce errors that build up with each succeeding generation of models.