Digital Preservation Is Not Keeping Up With the Growth of Scholarly Knowledge – Slashdot
Nature: Millions of research articles are absent from major digital archives. This worrying finding, which Nature reported on earlier this year, was laid bare in a study by Martin Eve, who studies technology and publishing at Birkbeck, University of London. Eve sampled more than seven million articles with unique digital object identifiers (DOIs), strings of characters used to identify and link to specific publications, such as scholarly articles and official reports. He found that more than two million of these were ‘missing’ from archives; that is, they were not preserved in the major archives that ensure literature can be found in the future.
Eve, who is also a research developer at Crossref, an organization that registers DOIs, carried out the study in an effort to better understand a problem librarians and archivists already knew about — that although researchers are generating knowledge at an unprecedented rate, it is not necessarily being stored safely for the future. One contributing factor is that not all journals or scholarly societies survive in perpetuity. For example, a 2021 study found that a lack of comprehensive and open archiving meant that 174 open-access journals, covering all major research topics and geographical regions, vanished from the web in the first two decades of this millennium.
A lack of long-term archiving particularly affects institutions in low- and middle-income countries, less-affluent institutions in rich countries and smaller, under-resourced journals worldwide. Yet it’s not clear whether researchers, institutions and governments have fully taken the problem on board. […] At the heart of the problem is a lack of money, infrastructure and expertise to archive digital resources. […] For institutions that can afford it, one solution is to pay a preservation archive to safeguard content. Examples include Portico, based in New York City, and CLOCKSS, based in Stanford, California, both of which count a raft of publishers and libraries as customers.