GitHub on Wednesday announced plans to launch the Arctic Code Vault, which aims to preserve open-source software for future generations for at least 1,000 years.
The code-sharing site is partnering with the Long Now Foundation, the Internet Archive, the Software Heritage Foundation, Arctic World Archive, Microsoft Research, the Bodleian Library, and Stanford Libraries to ensure the long-term preservation of the world’s open-source software.
“There is a long history of lost technologies from which the world would have benefited, as well as abandoned technologies which found unexpected new uses, from Roman concrete, or the anti-malarial DFDT, to the hunt for mothballed Saturn V blueprints after the Challenger disaster,” according to the GitHub announcement. “It is easy to envision a future in which today’s software is seen as a quaint and long-forgotten irrelevancy until an unexpected need for it arises. Like any backup, the GitHub Archive Program is also intended for currently unforeseeable futures as well.”
The company plans to store open-source software such as Flutter and TensorFlow where it could survive a doomsday scenario. The GitHub Arctic Code Vault is a data repository preserved in the Arctic World Archive (AWA), a very-long-term archival facility 250 meters deep in the permafrost of an Arctic mountain. The archive is located in a decommissioned coal mine in the Svalbard archipelago, Norway, closer to the North Pole than to the Arctic Circle.
GitHub stores its data on specialized ultra-durable film coated in iron oxide powder. The data can be read by a computer or, in the event of a global power outage, by a human with a magnifying glass. The film is designed to last for 1,000 years.
Piql AS, the Norwegian data storage company that makes the special film reels, said that the reels should last for up to 750 years in normal conditions, and perhaps even 2,000 years if stored in a cold, dry, and low-oxygen cave.
The reels are stored in a white container, and GitHub plans to leave 200 of them in the vault, each carrying 120 gigabytes of open-source code. The first deposit included the source code of the Linux and Android operating systems and 6,000 other important open-source applications.
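GitHub's stated figures of 200 film units at 120 gigabytes each imply a total planned capacity. The quick calculation below makes that concrete. It assumes decimal units (1 TB = 1,000 GB) and that every unit is filled, neither of which the announcement confirms.

```python
# Back-of-the-envelope capacity from the figures quoted in the article.
UNITS = 200          # film units GitHub plans to leave in the vault
GB_PER_UNIT = 120    # gigabytes carried by each unit

total_gb = UNITS * GB_PER_UNIT
total_tb = total_gb / 1000  # assumption: decimal terabytes

print(f"{total_gb:,} GB, or about {total_tb:g} TB")
# prints "24,000 GB, or about 24 TB"
```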
“We’re excited to partner with Piql to help preserve open-source software for future generations,” said Kyle Daigle, director of special projects at GitHub.
“Piql’s custom film and archiving technologies will allow us to store terabytes of data on a durable medium designed to last for over 1,000 years. We’re delighted that next year every active public GitHub repository will be written to this film, and safeguarded in the Arctic World Archive in Svalbard, for the centuries and generations to come.”
GitHub plans to capture a snapshot of every active public repository on February 2, 2020, and preserve that data in the Arctic Code Vault. The snapshot will also include “significant dormant repos as determined by stars, dependencies, and an advisory panel”, according to GitHub.
“The snapshot will consist of the HEAD of the default branch of each repository, minus any binaries larger than 100kB in size. Each repository will be packaged as a single TAR file,” it adds.
“For greater data density and integrity, most of the data will be stored QR-encoded. A human-readable index and guide will itemize the location of each repository and explain how to recover the data.”
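The packaging rule GitHub describes (the default branch's HEAD, no binaries over 100kB, one TAR file per repository) could be sketched roughly as follows. This is not GitHub's actual tooling. The function names, the NUL-byte test for "binary", and the reading of 100kB as 100,000 bytes are all assumptions made for illustration, and the sketch expects the default branch to be checked out already.

```python
import tarfile
from pathlib import Path

# Assumption: "100kB" is read as 100,000 bytes.
MAX_BINARY_BYTES = 100 * 1000


def looks_binary(path, sample_size=8192):
    """Heuristic (an assumption): a NUL byte near the start means binary."""
    with open(path, "rb") as fh:
        return b"\0" in fh.read(sample_size)


def package_snapshot(checkout_dir, tar_path):
    """Pack a checked-out working tree into one TAR, skipping large binaries.

    tar_path should live outside checkout_dir so the archive does not
    include itself. Returns the relative paths that were left out.
    """
    root = Path(checkout_dir)
    skipped = []
    with tarfile.open(tar_path, "w") as tar:
        for path in sorted(root.rglob("*")):
            rel = path.relative_to(root)
            # Leave out Git metadata and anything that is not a regular file.
            if ".git" in rel.parts or not path.is_file():
                continue
            if path.stat().st_size > MAX_BINARY_BYTES and looks_binary(path):
                skipped.append(rel.as_posix())
                continue
            tar.add(path, arcname=rel.as_posix())
    return skipped
```

Calling `package_snapshot("my-repo", "my-repo.tar")` on a local checkout would write the archive and return the list of omitted files. Large text files are kept, because the quoted rule excludes only binaries.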
The advisory panel will include experts from a range of fields, including anthropology, archaeology, history, linguistics, archival science, and futurism.
Besides the GitHub Archive Program, the company is also working on Microsoft’s Project Silica to “archive all active public repositories for over 10,000 years, by writing them into quartz glass platters using a femtosecond laser.”