News orgs are raging against the Wayback Machine
What some consider the digital Library of Alexandria is in danger of losing valuable scrolls. Major media outlets are blocking the Internet Archive’s Wayback Machine from saving their web pages, aiming to keep AI giants from training models on snapshots of old articles. Wired reported that 23 news organizations, including USA Today and the New York Times, are among the 241 sites denying the Internet Archive’s web crawler access to their articles. It’s not personal (some outlets still use the Archive in their reporting); it’s about the looming threat of AI.
Publishers can archive their own material, but a third party maintains a more incorruptible version of stories, one that can hold outlets accountable when articles are revised after publication.

Nothing new: Last year, Reddit barred the Wayback Machine from scraping its data over similar AI concerns. The archive also lost a slew of information when federal government websites were deleted.

Still working: Mark Graham, the Wayback Machine’s director, is reportedly in talks to regain access to the material, while more than 100 media workers signed a letter supporting the Wayback Machine.
