Archiving and Accessing Web Pages (09 Nov 2004)
These web sites may be captured as part of wholesale, periodic intranet "snapshots" or when backups are made for disaster recovery. However, these copies are not made for the purpose of long-term preservation or accessibility. The sites on the GSFC intranet are therefore subject to much the same instability as public Internet web sites: sites are moved, replaced, or eliminated entirely. The goal of the Web Capture project is to provide a web application that captures web sites of long-term scientific and technical interest, stores them, extracts metadata where possible, and indexes the metadata so that users can search for relevant information.
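The capture, store, extract, index, and search steps named in the abstract can be illustrated with a minimal sketch. This is not the Web Capture project's implementation: the class names `MetadataExtractor` and `WebArchive`, the in-memory storage, the choice of `<title>` and `<meta name=...>` tags as the metadata source, and the example URL are all assumptions made for illustration. The sketch takes HTML that has already been fetched, so it performs no network access.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class MetadataExtractor(HTMLParser):
    """Collects the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            self.metadata[attrs["name"].lower()] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.metadata["title"] = self.metadata.get("title", "") + data


class WebArchive:
    """Hypothetical in-memory archive: stores pages and indexes their metadata."""

    def __init__(self):
        self.pages = {}                # url -> captured HTML
        self.metadata = {}             # url -> extracted metadata dict
        self.index = defaultdict(set)  # term -> set of urls (inverted index)

    def capture(self, url, html):
        """Store a page, extract whatever metadata it has, and index it."""
        self.pages[url] = html
        parser = MetadataExtractor()
        parser.feed(html)
        parser.close()
        self.metadata[url] = parser.metadata
        for value in parser.metadata.values():
            for term in re.findall(r"\w+", value.lower()):
                self.index[term].add(url)

    def search(self, query):
        """Return the urls whose metadata contains every term in the query."""
        terms = re.findall(r"\w+", query.lower())
        if not terms:
            return set()
        matches = [self.index.get(term, set()) for term in terms]
        return set.intersection(*matches)


if __name__ == "__main__":
    archive = WebArchive()
    archive.capture(
        "http://intranet.example/ozone",  # assumed example URL
        "<html><head><title>Ozone Mapping Project</title>"
        '<meta name="keywords" content="ozone, satellite"></head></html>',
    )
    print(archive.search("ozone satellite"))
```

A page with no title or meta tags is still stored but contributes nothing to the index, which mirrors the abstract's "where possible" qualifier on metadata extraction. A real system would presumably need persistent storage and a richer metadata scheme than this sketch assumes.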
Article URL: http://www.dlib.org/dlib/november04/hodge/11hodge.html
Read 63 more articles from D-Lib Magazine sorted by date, popularity, or title.