Wikipedia has a solution for the deluge of AI training bots hogging its servers
You're not the only one who turns to Wikipedia for quick facts. Lately, a deluge of AI bots training on Wikipedia articles has put enormous strain on the organization's servers. To curb the influx of "non-human traffic" scraping the site for training data, Wikipedia is taking a proactive approach: serving up its data directly to AI developers. On Wednesday, the Wikimedia Foundation announced a partnership with Google-owned Kaggle to release a beta dataset "featuring structured Wikipedia content in English and French." According to the announcement, the dataset, which was uploaded on April 15, "simplifies access to clean, pre-parsed article data that's immediately usable for modeling, benchmarking, alignment, fine-tuning, and exploratory analysis."
Apr-21-2025, 17:25:57 GMT