Monday, December 8, 2014

18. Digital Preservation Part II

Suggestions for developing this blog are respectfully invited from all readers. Please send your suggestions and entries. Its scope is the entire world knowledge community, and it aims to make an important contribution to the career building of all competitive-exam candidates. You can send your suggestions to this email address: chandrashekhar.malav@yahoo.com


P- 01. Digital Libraries*

By: Jagdish Arora, Paper Coordinator

Glossary

A

Access
Ongoing usability of a digital resource, retaining all qualities of authenticity, accuracy and functionality deemed to be essential for the purposes the digital material was created and/or acquired for.
Access System Redundancy
An arrangement in which an entire system runs on two or more computers in two or more data centers. This is an excellent way to ensure that there is little interruption to near-term, ongoing access, but it does not alone guarantee usability, authenticity, or accessibility of the content over the long term.
AIP (Archival Information Package)
In the OAIS conceptual model, a collection (package) of content and preservation description information which is preserved in an OAIS-compliant archive.
Archive
i) An organisation whose function is the preservation of resources, either for a specific community of users, or for the general good; ii) the collection of resources so preserved.

B

Backup
Content is copied and stored in multiple locations. A well-managed backup system can help to quickly resolve problems with content encountered this week, or next week, or next month, but on its own is insufficient over the long-term.
Byte Replication
A process whereby identical, multiple copies of files, file systems, or websites are created. However, simple byte replication includes no provision to ensure the content is usable when the file formats are no longer current, nor is there any inherent provision to ensure that the content remains discoverable.

C

CLOCKSS
A not-for-profit joint venture started by libraries and publishers committed to ensuring long-term access to scholarly publications in digital format. Built on low-cost, open-source, award-winning LOCKSS technology, CLOCKSS's decentralized, geographically disparate preservation model ensures that the digital assets of the community will survive intact.

D

Digital preservation
The processes of maintaining accessibility of digital objects over time.
DIP (Dissemination Information Package)
In the OAIS conceptual model, a collection (package) of content and preservation description information which is delivered to the end user from an OAIS-compliant archive.

H

Hierarchical Storage Mechanisms (HSM)
A data storage mechanism where the most frequently used data is kept on fast disks while less frequently used data is kept in near-line storage such as an automated (robotic) tape library. An HSM can automatically migrate data from tape to disk and vice versa as required.

I

Ingest
The process of loading a digital file into a digital repository along with its descriptive metadata for subsequent retrieval is referred to as ingest.

L

Liaison and Advocacy
The preservation programme and libraries must advocate good practices among producers of digital contents with an aim to facilitate long-term availability of the material for which the programme will be responsible.
LOCKSS
An open-source, library-led digital preservation system built on the principle that “lots of copies keep stuff safe.”

M

Metadata
A set of data that describes and gives information about other data.
METS (Metadata Encoding and Transmission Standard)
An XML schema for packaging digital object metadata.
Microfilming
A photographic process used to produce reduced-size images of textual or graphic material on film. In this process a master is produced from which further copies can be made.

O

OAIS (Open Archival Information System)
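An archive, consisting of an organisation of people and systems, that has accepted responsibility to preserve information and make it available to a designated community. The term also refers to the reference model (ISO 14721) that describes such an archive.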
OCLC
A nonprofit, membership, computer library service and research organization dedicated to the public purposes of furthering access to the world’s information and reducing library costs.



0. Objectives

The objectives of this module are to discuss the following aspects of digital preservation:

  • Digital preservation metadata, models and standards
  • Major functions of a preservation programme
  • Storage management for digital preservation
  • Microfilming: a hybrid solution for digital preservation
  • Major digital preservation initiatives in India and world-wide

1. Introduction

This is the second of a two-part module on digital preservation. Some key points discussed in the first part of the module are as follows:
  • Need, relevance, problems and challenges of digital preservation;
  • Principles that guide digital preservation actions;
  • Factors that are involved in long-term digital preservation;
  • Digital preservation strategies; and
  • Impact of intellectual property rights and digital rights management on digital preservation.

Preservation metadata is one of the vital building blocks of digital preservation programmes. Without preservation metadata, digital material will be lost. Preservation metadata provides the vital information that makes “digital objects self-documenting across time”. This part of the module provides an overview of preservation metadata, metadata models, standards and functions of digital preservation programmes. It also discusses patterns of storage management, and microfilming as a hybrid model practised for long-term preservation. Finally, the module outlines selected world-wide and Indian initiatives in digital preservation.

2.0 Digital Preservation Metadata

Digital preservation metadata provides structured ways to describe and record the information needed to manage the preservation of digital resources. It is one of the vital building blocks of digital preservation programmes and is essential for establishing and documenting the authenticity, integrity, provenance and trustworthiness of digital objects. Effective digital preservation depends on a set of preservation services, such as conserving materials, digitizing collections, preserving library content in digital formats, and providing robust education and outreach programmes. These services need digital preservation metadata in order to work together and ensure that digital objects are preserved for the long term. Three preservation metadata models and standards are described below:


2.1 Open Archival Information System (OAIS)

The OAIS Reference Model was developed by the Consultative Committee for Space Data Systems (CCSDS), of which NASA is a member, as a conceptual framework describing archival systems for the long-term preservation of digital data (Lee, 2010). OAIS has since been adopted as an ISO standard (ISO 14721:2003). The model establishes terminology and concepts relevant to digital archiving, and proposes an information model consisting of the following three components for digital objects and their associated metadata:

i) Submission Information Package (SIP): SIP is the content and associated metadata “ingested” into the repository at the time of deposit;
ii) Archival Information Package (AIP): AIP is the content and associated metadata actually stored and managed by the repository over the long term; and
iii) Dissemination Information Package (DIP): DIP is the content and associated metadata provided by the repository in response to access requests by users.
The thematic diagram of OAIS Reference Model, along with the three variations of information packages, is shown in Figure 1.
[Figure 1: OAIS Reference Model with the three information packages (SIP, AIP and DIP)]
The metadata in OAIS Model plays an essential role in preserving digital content and supporting its use over the long-term. The OAIS information model implicitly establishes the link between metadata and digital preservation – i.e., preservation metadata. The OAIS reference model provides a high-level overview of the types of information needed to support digital preservation that can broadly be grouped under two major umbrella terms called i) Preservation Description Information (PDI); and ii) Representation and Descriptive Information. 

2.1.1      Preservation Description Information

The preservation description information consists of the following four major types of metadata elements:

i) Reference Information: Reference information is used for identification or description and includes global identifiers, identifiers local to the archive and some method of pointing to an existing descriptive metadata record for the resource, such that it can be referred to unambiguously, both internally and externally to the archive (e.g., ISBN, URN).

ii) Provenance Information: Documents the history of the content information (e.g., its origins, chain of custody, preservation actions, effects and rights management) and helps to support claims of authenticity and integrity.

iii) Context Information: Documents the relationship of the content information to its environment: how it is related to other objects, how it is related to other manifestations of the same object, and how it is intellectually related to other objects (e.g., why it was created, relationships to other content information).

iv) Fixity Information: Documents authentication mechanisms used to ensure that the content information has not been altered in an undocumented or unauthorized manner (e.g., checksum, digital signature).
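Fixity information can be illustrated with a checksum. The following is a minimal sketch, assuming SHA-256 as the checksum algorithm and a digest recorded at deposit; the function names `sha256_of` and `fixity_ok` are illustrative and do not come from the OAIS model.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path: Path, recorded_digest: str) -> bool:
    """Compare the current digest with the digest recorded at deposit."""
    return sha256_of(path) == recorded_digest
```

A repository would typically store the recorded digest alongside the object and re-run a check of this kind periodically; any mismatch signals an undocumented change to the bit-stream.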

2.1.2      Representation and Descriptive Information

Representation information facilitates proper rendering, understanding, and interpretation of a digital object's content. At the most fundamental level, representation information imparts meaning to an object’s bit-stream. For example, it may indicate that a sequence of bits represents text encoded as ASCII characters and furthermore, that the text is in French. The depth of the representation information required depends on the designated community for whom the content is intended. Descriptive Information metadata contains more ephemeral metadata, the information used to aid searching, ordering, and retrieval of the objects.
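The relationship between the three information packages and the metadata categories described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class and field names (`InformationPackage`, `PreservationDescription`, `ingest`, `disseminate`) are hypothetical, and a real OAIS implementation would carry far richer metadata.

```python
import hashlib
from dataclasses import dataclass, field, replace


@dataclass
class PreservationDescription:
    """The four PDI categories: reference, provenance, context, fixity."""
    reference: dict = field(default_factory=dict)
    provenance: list = field(default_factory=list)
    context: dict = field(default_factory=dict)
    fixity: dict = field(default_factory=dict)


@dataclass
class InformationPackage:
    kind: str             # "SIP", "AIP" or "DIP"
    content: bytes        # the bit-stream being preserved
    representation: dict  # e.g. character encoding and language
    descriptive: dict     # information used for search and retrieval
    pdi: PreservationDescription = field(default_factory=PreservationDescription)


def ingest(sip: InformationPackage, local_id: str) -> InformationPackage:
    """Turn a SIP into an AIP by adding archive-side PDI (hypothetical rules)."""
    pdi = PreservationDescription(
        reference={**sip.pdi.reference, "local_id": local_id},
        provenance=[*sip.pdi.provenance, "ingested into archive"],
        context=dict(sip.pdi.context),
        fixity={"sha256": hashlib.sha256(sip.content).hexdigest()},
    )
    return replace(sip, kind="AIP", pdi=pdi)


def disseminate(aip: InformationPackage) -> InformationPackage:
    """Derive a DIP for delivery to a user; here it is a plain copy."""
    return replace(aip, kind="DIP")
```

For example, a SIP whose representation information is `{"encoding": "ASCII", "language": "fr"}` keeps that information unchanged through `ingest` and `disseminate`, while the archive adds its own local identifier and fixity value.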



2.2 PREMIS (PREservation Metadata: Implementation Strategies)

In 2003, OCLC and RLG established Preservation Metadata: Implementation Strategies (PREMIS), an international working group. Composed of more than thirty international experts in preservation metadata, PREMIS sought to: i) define a core set of implementable, broadly applicable preservation metadata elements, supported by a data dictionary; and ii) identify and evaluate alternative strategies for encoding, storing, managing, and exchanging preservation metadata in digital archiving systems. In September 2004, PREMIS released a survey report describing current practice and emerging trends associated with the management and use of preservation metadata to support repository functions and policies. The final report of the PREMIS Working Group was released in May 2005. The PREMIS Data Dictionary is a comprehensive, practical resource for implementing preservation metadata in digital archiving systems. It defines implementable, core preservation metadata, along with guidelines and recommendations for management and use. PREMIS also developed a set of XML schemas to support use of the Data Dictionary by institutions managing and exchanging PREMIS-conformant preservation metadata.

Figure 2 provides the PREMIS data model consisting of five entities associated with the digital preservation process: Intellectual Entity (a coherent set of content that is described as a unit: e.g., a book); Object (a discrete unit of information in digital form, e.g., a PDF file); Event (a preservation action, e.g., ingest of the PDF file into the repository); Agent (person, organization, or software program associated with an Event, e.g., the publisher of the PDF file who deposits it in the repository); and Rights (one or more permissions pertaining to an Object, e.g., permission to make copies of the PDF file for preservation purposes). Each entity is described by a set of properties called semantic units. Each semantic unit represents a discrete piece of information to be recorded as part of the metadata supporting the digital preservation process. In short, a PREMIS semantic unit can be recorded in any way a repository finds convenient, given the processes and architecture of its repository system, as well as its metadata management procedures. A semantic unit can be recorded as a single metadata element, or broken up over multiple metadata elements if the repository prefers.
[Figure 2: PREMIS data model]
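The five PREMIS entities can be pictured as linked records. The sketch below is an assumption-laden illustration, not the PREMIS XML schema: the class names, field names and the helper `record_event` are hypothetical, and each field stands in for one or more semantic units.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    identifier: str
    name: str


@dataclass
class Event:
    event_type: str   # e.g. "ingest"
    date_time: str
    agent_id: str     # links the Event to the Agent that performed it


@dataclass
class Rights:
    statement: str    # e.g. permission to copy for preservation


@dataclass
class DigitalObject:
    identifier: str
    format_name: str
    events: list = field(default_factory=list)
    rights: list = field(default_factory=list)


@dataclass
class IntellectualEntity:
    title: str
    objects: list = field(default_factory=list)


def record_event(obj: DigitalObject, event_type: str,
                 agent: Agent, date_time: str) -> Event:
    """Attach a preservation Event to an Object and link it to an Agent."""
    event = Event(event_type=event_type, date_time=date_time,
                  agent_id=agent.identifier)
    obj.events.append(event)
    return event
```

In the example from the text, the book is the `IntellectualEntity`, the PDF file is the `DigitalObject`, the publisher is the `Agent`, and depositing the file produces an `Event` of type "ingest".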


2.3 METS: A Standard for Packaging Metadata and Content together in Digital Repository System

METS (Metadata Encoding and Transmission Standard) is designed to organize and link different types of metadata to their associated content in the OAIS Reference Model. METS is an XML schema designed specifically as an overall framework within which all the metadata associated with a digital object can be stored. A METS file comprises the following four major constituent sections:

i) A file inventory for all the files associated with the digital object, including still image, text, video or audio files;
ii) A section for administrative metadata;
iii) A section for descriptive metadata; and
iv) A structural map, which indicates in a hierarchical manner how the various components of the item relate to each other, thus allowing its constituent elements to be navigated by the user.

These four sections are linked to each other by means of identifiers. The practical implementation of METS can be very flexible as well. Any system capable of handling XML documents can be used to create, store and deliver METS-based metadata (Lavoie, 2005). METS provides the ability to associate a digital object with behaviors or services. 
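The four sections and their identifier links can be sketched with Python's standard XML library. This is a minimal skeleton, assuming the METS element names `dmdSec`, `amdSec`, `fileSec` and `structMap`; the identifiers, the `TYPE` values and the function `build_mets` are illustrative, the metadata sections are left empty, and the output is not validated against the METS schema.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("mets", METS_NS)
ET.register_namespace("xlink", XLINK_NS)


def q(tag: str) -> str:
    """Qualify a tag name with the METS namespace."""
    return f"{{{METS_NS}}}{tag}"


def build_mets(file_paths: list) -> ET.Element:
    """Build a skeleton METS document for a list of page image files."""
    root = ET.Element(q("mets"))
    ET.SubElement(root, q("dmdSec"), ID="DMD1")   # descriptive metadata
    ET.SubElement(root, q("amdSec"), ID="AMD1")   # administrative metadata
    file_sec = ET.SubElement(root, q("fileSec"))  # file inventory
    file_grp = ET.SubElement(file_sec, q("fileGrp"))
    struct_map = ET.SubElement(root, q("structMap"))
    # The top-level div points at the metadata sections by identifier.
    item = ET.SubElement(struct_map, q("div"),
                         TYPE="item", DMDID="DMD1", ADMID="AMD1")
    for number, path in enumerate(file_paths, start=1):
        file_id = f"FILE{number}"
        file_el = ET.SubElement(file_grp, q("file"), ID=file_id)
        ET.SubElement(file_el, q("FLocat"),
                      {"LOCTYPE": "URL", f"{{{XLINK_NS}}}href": path})
        # Each page div points at its file by identifier.
        page = ET.SubElement(item, q("div"), TYPE="page", ORDER=str(number))
        ET.SubElement(page, q("fptr"), FILEID=file_id)
    return root
```

Calling `ET.tostring(build_mets(["page1.tif", "page2.tif"]), encoding="unicode")` serializes the skeleton; the `FILEID`, `DMDID` and `ADMID` attributes are the identifiers that tie the sections together.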



3.0 Major Functions of Preservation Programmes

UNESCO’s Guidelines for the Preservation of Digital Heritage (2003) describe the following functions that a full-fledged digital preservation programme should perform:


3.1 Creation of a Safe Place

Preservation programmes must identify or create a safe place for storing and managing digital materials. An organization may either set up its own infrastructure or outsource its digital preservation activity to a reliable third party. In either case, it remains the responsibility of the organization concerned to ensure long-term availability of its digital contents.

3.2 Ingest

The process of loading a digital file into a digital repository along with its descriptive metadata for subsequent retrieval is referred to as ingest. The steps involved in ingest include:

  • Applying collection policies and selection criteria to ascertain whether material can be accepted on submission or not;
  • Checking the quality of the material submitted, including its completeness and authenticity, and ascertaining that the material has been scanned for viruses;
  • Assigning unique identifiers to digital objects;
  • Assigning and managing copyright of digital material;
  • Assessing the elements that must be maintained, and assigning preservation objectives;
  • Setting retention and review periods for the digital material as deemed appropriate;
  • Checking and upgrading the documentation that describes the material, including the technical and preservation metadata;
  • Checking the file format(s) and converting them into another format if so desired to comply with the policy of digital preservation programme;
  • Saving digital objects and associated metadata, after verification, to the archival storage system.
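Several of the ingest steps listed above can be sketched as one function. This is an illustration only: the accepted formats, the review period, the required `title` field and the in-memory `storage` dictionary are all hypothetical stand-ins for a repository's actual policy and archival storage.

```python
import hashlib
import uuid
from datetime import date, timedelta

ACCEPTED_FORMATS = {"pdf", "tiff", "xml"}  # hypothetical collection policy
REVIEW_PERIOD_DAYS = 5 * 365               # hypothetical review interval


def ingest(filename: str, content: bytes, metadata: dict,
           storage: dict, today: date) -> str:
    """Check a submission, assign an identifier and save it with metadata."""
    extension = filename.rsplit(".", 1)[-1].lower()
    if extension not in ACCEPTED_FORMATS:
        raise ValueError(f"format not accepted by policy: {extension}")
    if not content:
        raise ValueError("submission is empty")
    if "title" not in metadata:
        raise ValueError("descriptive metadata is incomplete")

    identifier = str(uuid.uuid4())  # unique identifier for the object
    record = {
        **metadata,
        "filename": filename,
        "format": extension,
        "sha256": hashlib.sha256(content).hexdigest(),
        "review_on": today + timedelta(days=REVIEW_PERIOD_DAYS),
    }
    storage[identifier] = (content, record)  # save object and metadata
    return identifier
```

Virus scanning, rights management and format conversion are left out of this sketch; in practice each would be a further check or transformation before the final save.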

3.3 Archival Storage

A digital preservation programme should provide archival storage that maintains, protects and verifies the integrity of the stored digital objects and associated metadata, whether stored as a single data stream or as separate but linked data streams.

The archival storage system must include practices that protect the data stream from unintended change, damage or loss. This can be achieved by regularly copying the data stream to fresh media, or to new media types, when necessary. Storage practices should also include regular checks of the data stream for corruption, system security measures, backup regimes that place copies at remote sites, and disaster recovery plans that address contingencies such as complete loss of the system’s operating infrastructure.


3.4 Preservation Planning

The basic function of preservation planning is to monitor threats to the accessibility of digital material and to specify the action required to counter such threats. While archival storage offers data protection, continuing access to the data also has to be ensured. Technology changes that affect accessibility should be monitored. The remedial action may involve migrating or upgrading the digital object to a different format or encoding, or changing the metadata that describes the means of access and links to current access tools.

3.5 Implementing Preservation Strategy

The preservation strategies discussed in the first part of this module differ substantially in their methods of preserving digital records. However, the process of implementation is quite similar. The steps involved in implementing preservation strategies, as adopted by the National Archives of Australia (2007), are given below:

  • Identify Materials Requiring Preservation: Identify and select digital materials that require preservation treatments.

  • Research Appropriate Preservation Strategy: Investigate the hardware and software technologies required to successfully implement the preferred preservation approach. Different preservation strategies may have different prerequisites. For example, in the case of emulation, it may involve the development of specialised software capable of re-creating the source records within a new computer environment. In the case of migration, this may involve identifying suitable migration paths (i.e., software applications with sufficient backward compatibility to transfer source records from an outmoded data format to a current data format). In the case of encapsulation, this may involve software with the ability to embed metadata or ‘package’ it with the record.

  • Test Proposed Solution: Before a preservation approach is fully implemented, comprehensive testing of the technical processes must be conducted. Testing should be performed on duplicates of source records.

  • Back up Records Identified for Preservation: Prior to implementation, all digital records identified for preservation treatment should be backed up and the integrity of the backups verified. These duplicate source records should not be subjected to a preservation process and will serve as master copies should the selected preservation treatment be unsuccessful.

  • Apply the Preservation Treatment: After successful testing, the treatment should be applied to all digital records identified for preservation treatment. This treatment varies from one preservation strategy to another. For example, for migration and encapsulation techniques, it would entail applying preservation treatments to the source records, thereby altering their format. For an emulation-based technique, the records identified for preservation would be transferred to the new environment without altering the records themselves.

  • Audit the Integrity of Preserved Records: The preserved records should be subjected to rigorous testing to ensure that there is no loss of content and no unintended change in structure or format. The integrity of all relevant metadata associated with the preserved records should be verified. Metadata should also be updated to record the preservation treatment.

  • Destroy Source Records where Appropriate: Once the preservation process has been completed and the integrity of the preserved records has been verified, duplicate source records may be destroyed.

  • Establish Monitoring Regimes: The integrity of the preserved records, their functionality, structure, content and context, and associated metadata, should be monitored periodically following preservation to ensure the stability of the preserved records and to identify when subsequent preservation treatments are required.
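The back-up, apply, audit and record-keeping steps above can be sketched for a migration treatment. This is a toy illustration under stated assumptions, not the National Archives of Australia's procedure: the record layout, the function names and the Latin-1 to UTF-8 text migration are all hypothetical choices made to keep the example small.

```python
import hashlib


def migrate_record(record: dict, treatment, audit) -> dict:
    """Apply a preservation treatment, keeping the source as a master copy."""
    # Back up: an untouched duplicate serves as the master copy.
    master = dict(record, history=list(record["history"]))
    master_digest = hashlib.sha256(master["data"]).hexdigest()

    # Apply the treatment to the data, never to the master itself.
    candidate_data, new_format = treatment(record["data"])

    # Audit: if content was lost, fall back to the master copy.
    if not audit(master["data"], candidate_data):
        return master

    # Update metadata to record the preservation treatment.
    note = (f"migrated {record['format']} -> {new_format}; "
            f"source sha256 {master_digest}")
    return {"data": candidate_data, "format": new_format,
            "history": master["history"] + [note]}


def latin1_to_utf8(data: bytes):
    """Toy migration path: re-encode Latin-1 text as UTF-8."""
    return data.decode("latin-1").encode("utf-8"), "text/utf-8"


def same_text(source: bytes, migrated: bytes) -> bool:
    """Toy audit: the decoded text must be identical before and after."""
    return source.decode("latin-1") == migrated.decode("utf-8")
```

Keeping the audit as a separate function reflects the point made in the steps above: the check on the preserved record is independent of the treatment that produced it.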


3.6 Data Management

Managing digital materials in the archive generates its own data about what material is stored, what can be accessed, and about the management of the archive. This data must be managed to support use of the archive, and to support its effective administration.


3.7 Access

This function provides a user interface to the archive, allowing users to browse, search and discover its holdings, and to request material and receive copies of it. Access to an archive may either be restricted or be made available to all potential users. The access function may well require mechanisms to control access.


3.8 Liaison and Advocacy

The preservation programme and libraries must advocate good practices among producers of digital contents with an aim to facilitate long-term availability of the material for which the programme will be responsible. There is also a need to understand who would be the likely users of the material, so that preservation and access arrangements can be tailored to their needs and expectations.

3.9 Management, Administration and Support Functions

A digital preservation programme must be managed professionally. This involves the development of policy frameworks and standards covering all areas of operation, the ongoing supply of appropriate resources and infrastructure, including suitable technical systems, and management processes such as monitoring and reporting on the programme’s operations. The OAIS Reference Model is a high-level conceptual framework that can be used as a reference point by those designing, using and evaluating real implementations.

4. Storage Management for Digital Preservation

Major threats to digital preservation include the short life of storage media, obsolete hardware and software, and slower read times of old media. The selection and installation of software components is therefore crucial when building a digital preservation programme. The basic tenets of digital preservation extend well beyond storage media life: devices used for reading storage media rapidly become obsolete, and the various formats (and their changing versions) of digital documents and images introduce additional complications. The storage operation in digital archives primarily addresses the media-level formatting of information objects.

Primary considerations for storage of digital materials include levels of hierarchy and redundancy. A digital archive may have multiple levels of storage depending upon the levels of expected use and expected retrieval performance. Digital repositories that are too large to store on a single disk can use a hierarchical storage mechanism (HSM). In an HSM, the most frequently used data is kept on fast disks while less frequently used data is kept in near-line storage such as an automated (robotic) tape library. An HSM can automatically migrate data from tape to disk and vice versa as required.

Digital material in a distributed network may be stored online in multiple locations. Besides off-line and online storage, near-line storage may be adopted, wherein information objects are stored on optical or tape media and loaded in a jukebox. Retrieval time in near-line storage systems is higher than in online storage, but such systems are considerably more responsive to user demand than off-line storage. A digital archive may use any or all of these methods. The most sophisticated systems combine the resources so that objects in use or recently used are stored online and, as they age from the time of most recent use, they move to near-line storage and then eventually to off-line storage.
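The ageing policy described above, in which objects move from online to near-line to off-line storage as time passes since their last use, can be sketched as a simple rule. The thresholds of 30 and 365 days and the function name `storage_tier` are assumptions chosen for illustration; a real HSM would apply its own configurable policy.

```python
from datetime import datetime, timedelta

ONLINE_WINDOW = timedelta(days=30)     # hypothetical threshold
NEARLINE_WINDOW = timedelta(days=365)  # hypothetical threshold


def storage_tier(last_access: datetime, now: datetime) -> str:
    """Choose a storage tier from the time elapsed since last access."""
    age = now - last_access
    if age <= ONLINE_WINDOW:
        return "online"
    if age <= NEARLINE_WINDOW:
        return "near-line"
    return "off-line"
```

A periodic job could call `storage_tier` for every object and migrate those whose tier has changed, which is the behaviour the text attributes to the most sophisticated systems.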

Redundancy is another important storage consideration. Effective storage management thus means providing for redundant copies of the archived objects to ensure availability of documents in case of loss. A number of RAID (Redundant Array of Inexpensive Disks) models are now available for greater security and performance. The RAID technology distributes the data across a number of disks in a way that even if one or more disks fail, the system would still function while the failed component is replaced. Digital archives may also choose to make backup copies on their own or to make arrangements for other sites to serve as backup.
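One way RAID achieves this is parity: an extra block is stored that is the bitwise XOR of the data blocks, so any single missing block can be recomputed from the others. The sketch below illustrates only this single-parity idea (as used, for example, in RAID 5), which tolerates the loss of one disk; the function names are illustrative and real RAID controllers also handle striping and block layout.

```python
from functools import reduce


def xor_blocks(blocks: list) -> bytes:
    """Bitwise XOR of equally sized byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column)
                 for column in zip(*blocks))


def rebuild_missing(surviving_blocks: list, parity: bytes) -> bytes:
    """Recompute the one missing data block from the survivors and parity."""
    return xor_blocks([*surviving_blocks, parity])
```

With three data blocks, the parity block is `xor_blocks([d0, d1, d2])`; if the disk holding `d1` fails, `rebuild_missing([d0, d2], parity)` returns the lost block.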

Although hard disc (fixed and removable) solutions are increasingly available at an affordable cost, optical storage devices including WORM, CD-R, CD ROM, DVD ROM or opto-magnetic devices in standalone or networked mode, are attractive alternatives for long-term storage of digital information. Optical drives record information by writing data onto the disc with a laser beam. The media offer enormous storage capabilities. Some of the important features of storage infrastructure for satisfying requirements of digital preservation are as follows:

  • Increased scalability:  The storage media should be scalable depending on the requirement of a digital archive.

  • Availability of storage devices to multiple servers: The storage system should be a sharable device that can be accessible from multiple servers. Increased availability and sharing among storage devices allows for effective load balancing and redundancy. Intelligent storage networks and Network Attached Storage (NAS) are now available in which the physical storage devices are intelligently controlled and made available to a number of servers.

  • High-speed throughput: The storage device should use Fibre Channel for carrying traffic between devices at high speed.

  • Separation from the LAN: The storage system attached to a digital repository should only be accessible via devices physically connected to it, so that the storage system remains unaffected by traffic on the user LAN and vice versa.


5. Microfilming and Digital Preservation: A Hybrid Solution

Microfilming is a tried and tested technology for the preservation of documents, with proven longevity. In 1992, renowned microfilm expert Don Willis drew upon developments in the infant technology of mass digital storage to suggest that microfilm and digital technologies could be combined to meet the needs of both archival storage and digital access. The proposed hybrid solution suggests microfilming the document as a first step and then digitizing from the film master. It is argued that for a computer image to match the resolution of high-resolution microfilm, the item would need to be scanned at over 600 dots per inch, which is practically impossible with prevailing scanning technology, as it would require excessive scanning time and storage space. Moreover, scanners are not designed to scan at such a high resolution, nor can documents scanned at such a resolution be displayed using present-day display technology. The hybrid solution provides the best of both worlds. The high-resolution microfilm masters can be safely archived, and retrieved when needed to generate a new high-use, highly accessible digital version. The process also serves to circumvent a key problem with digital technology, namely constant migration. The life expectancy of microfilm is in the 500+ year range (Jones 1993). The microfilm master, if properly stored, is quite simply the most stable reformatting method available today. Yale’s Open Book Project (1991-96) suggested that preservation-quality microfilm produces better-quality digital images, but that the costs incurred in creating such film will not be recouped through reduced digital conversion costs.
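The storage argument above can be made concrete with simple arithmetic on uncompressed image size. The sketch assumes an 8.5 x 11 inch page and ignores compression and file-format overhead; the page size and the function name are assumptions introduced here, not figures from the studies cited.

```python
def uncompressed_scan_bytes(width_in: float, height_in: float,
                            dpi: int, bits_per_pixel: int) -> int:
    """Uncompressed size in bytes of a page scanned at a given resolution."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# 24-bit colour scan of an 8.5 x 11 inch page (assumed page size).
size_600 = uncompressed_scan_bytes(8.5, 11, 600, 24)  # 100,980,000 bytes
size_300 = uncompressed_scan_bytes(8.5, 11, 300, 24)  # 25,245,000 bytes
```

Because the pixel count grows with the square of the resolution, doubling the resolution from 300 to 600 dpi quadruples the uncompressed size of each page.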

6.1 Portico’s Digital Preservation Service

Portico describes itself as the largest community-supported digital archive in the world, committed to the preservation of digital publications such as e-journals, e-books, and other digital content. Portico works collaboratively with libraries, publishers, and funders to ensure that researchers and students will have access to this content in the future. Portico's approach includes the following:
  • Preserves the largest and broadest collection of e-books and e-journals of any third-party preservation archive.
  • Maintains legal agreements with publishers to guarantee long-term preservation.
  • Monitors and communicates information about the abandonment of e-journal and e-book titles.
  • Helps set industry standards and utilizes best practices for e-content validation and authentication, metadata management, and distributed replication.
Portico preserves content through a format-based migration strategy. The key points of this strategy are: (i) identifying key preservation metadata at the initial point of preservation; and (ii) practical preservation of content, such that content is migrated only at the point where it becomes necessary.
To meet the goals of digital preservation (usability, authenticity, discoverability and accessibility), Portico follows exact standards and processes for content management, maintenance and replication of the archive; conducts self-checks and third-party archive certifications to ensure quality and security; and maintains a delivery system and services to provide access to users in ways that are as easy and as integrated with other online resources as users expect (http://www.portico.org/digital-preservation/).


6.2 The LOCKSS (Lots of Copies Keep Stuff Safe) Programme

The LOCKSS Programme, based at Stanford University Libraries, provides libraries and publishers with award-winning, low-cost, open-source digital preservation tools to preserve and provide access to persistent and authoritative digital content. It is presented as an approach that mitigates the broad set of technical, economic and social threats to the security and long-term preservation of digital content. LOCKSS’s open-source technology is built on a peer-to-peer software infrastructure that preserves trustworthy, authoritative and original scholarly content for long-term access. It is a cooperative, affordable and decentralized preservation system over a shared library network that relies on lots of copies to keep stuff safe (http://www.lockss.org/).

LOCKSS technology preserves the publisher’s original content as of the date of web publication. The “look and feel” of the content, along with the publisher’s branding, is preserved, resulting in an authentic representation of the authoritative source file. The LOCKSS programme also preserves the links going into and out of digital content – links that reference other digital objects and provide valuable context to the original work. The LOCKSS approach takes steps to ensure libraries are responsible not only for short-term access, but involved at many stages in the emerging model of journal archiving. 

6.3 The CLOCKSS (Controlled LOCKSS) Programme

CLOCKSS (Controlled LOCKSS) is a not-for-profit joint venture between the world’s leading academic publishers and research libraries whose mission is to build a sustainable, geographically distributed dark archive with which to ensure the long-term survival of Web-based scholarly publications for the benefit of the greater global research community. Built on low-cost, open-source, award-winning LOCKSS technology, the CLOCKSS archive comprises a network of redundant nodes located at 12 major research libraries, into which e-content is ingested, copied, and preserved. CLOCKSS's decentralized, geographically disparate preservation model ensures that the digital assets of the community will survive intact (http://www.clockss.org/).

CLOCKSS is committed to providing a very long-term preservation solution. It aims to become a safe haven for scholarly content from all corners of the world, including underserved scholarly communities and those who cannot afford to archive their materials on their own.

Other selected digital preservation programmes are:


| Sl. No | Project | Description |
|---|---|---|
| 1 | ADAPT: An Approach to Digital Archiving and Preservation Technology | The main aim is to develop technologies for building a scalable and reliable infrastructure for the long-term access and preservation of digital assets. |
| 2 | DCC: Digital Curation Centre | It was established to help solve the extensive challenges of digital preservation and to provide research, advice and support services to UK institutions. |
| 3 | DELOS Digital Preservation Cluster | The main focus [is] on those [tasks] designed to initiate collaborative interaction between institutions and individuals, focus and enable digital preservation and deliver tangible research by bringing together the fragmented research results in different laboratories. |
| 4 | Digital Object Management (DOM) | The mission is to enable the United Kingdom to preserve and use its digital output forever. |
| 5 | Digital Preservation Coalition | It was established to foster joint action to address the urgent challenges of securing the preservation of digital resources in the UK and to work with others internationally to secure our global digital memory and knowledge base. |
| 6 | DORIA: Digital Object Management System | A project that aims to create a platform for preservation, cataloguing and distribution of digital collections. |
| 7 | The ECHO DEPository Project, 2004-7 | Its activities include the development of new tools for selecting and capturing materials published on the web, the evaluation of existing tools for storing and accessing digital objects, and research into the challenges of maintaining archived digital resources into the future. |
| 8 | ERPANET: Electronic Resources Preservation and Access Network | The objective is to establish an expandable European consortium which will make viable and visible information, best practice and skills development in the area of digital preservation of cultural heritage and scientific objects. |
| 9 | espida: An Effective Strategic Model for the Preservation and Disposal of Institutional Assets | Its focus is on the creation of a model of the relationships, roles and responsibilities, costs, benefits and risks inherent in institutional digital preservation. |
| 10 | GPO LOCKSS Pilot Project | A pilot scheme to make federal government e-journals available to select pilot libraries that are operating LOCKSS boxes. |
| 11 | Internet Archive | The Internet Archive is building a digital library of Internet sites and other cultural artifacts in digital form. |
| 12 | InterPARES 2 | In addition to dealing with issues of authenticity, it delves into the issues of reliability and accuracy from the perspective of the entire life-cycle of records, from creation to permanent preservation. |
| 13 | kopal: Cooperative development of a long-term digital | The principal goal is to develop technological and organizational solutions to ensure the long-term availability of electronic publications. |
| 14 | LIFE: Life Cycle Information for E-Literature | LIFE ‘will examine the life cycles of key digital collections at UCL and the British Library and establish the individual stages in the cycle. This stage will then be costed to show the full financial commitment of collecting digital materials over the long term’. |
| 15 | Mandate: Managing Digital Assets in Tertiary Education | The aim is to develop a toolkit to support the creation and implementation of digital asset management and preservation in the further education setting and to demonstrate its applicability. |
| 16 | MetaArchive | A process to ‘develop a cooperative for the preservation of at-risk digital content with a particular content focus: the culture and history of the American South’ and to test LOCKSS as the technology infrastructure. |
| 17 | National Archive | The Digital Preservation department is ‘playing an active role in storing and preserving digital material’ for government departments and the public sector. |
| 18 | New Zealand Trusted Digital Repository | To establish ‘a trusted digital repository for the long-term preservation and maintenance of digital materials aimed at providing New Zealand access to their digital heritage’. |
| 19 | North Carolina Geospatial Data Archiving Project (NCGDAP) | The focus is ‘on collection and preservation of digital geospatial data resources from state and local government agencies in North Carolina’. |
| 20 | OCLC Digital Archive | A system that ‘offers real-world solutions for the challenges of archiving and preservation in the virtual world’. |
| 21 | Our Digital Island | It ‘provides access to Tasmanian web sites that have been preserved for posterity by the State Library of Tasmania’. |
| 22 | PRESERV: Preservation Eprint SERVices | PRESERV is a ‘project investigating and developing infrastructural digital preservation services for institutional repositories’. |
| 23 | Preserving Digital Public Television | The principal aim is to design a ‘preservation repository that the [American] public television system can afford to maintain and use’. |
| 24 | PrestoSpace: Preservation towards Storage and Access: Standardised Practices for Audiovisual Contents in Europe | The ‘objective is to provide technical solutions and integrated systems for digital preservation of all types of audiovisual collections. The project intends to provide tangible results in the domain of preservation, restoration, storage and archive management, content description, delivery and access’. |
| 25 | reUSE | The scheme will ‘focus on the publications of public sector institutions. Together with the printed material the digital originals will be collected, preserved and made available’. |
| 26 | SHERPA DP: Creating a Persistent Preservation Environment for Institutional Repositories | The objective is to ‘create a collaborative, shared preservation environment for the SHERPA project framed around the OAIS reference model’. |
| 27 | Sound Directions: Digital Preservation and Access for Global Audio Heritage | An archiving project to ‘create best practices and test emerging standards for digital preservation of archival audio’. |
| 28 | Sun Centre of Excellence for Digital Futures in Libraries | The aim is to ‘develop an advanced information lifecycle management system, which will serve as an international model for digital repositories and preservation management’. |
| 29 | Tufts and Yale: Fedora and the Preservation of University Records | ‘To synthesize electronic records preservation research with digital library repository research in an effort to develop systems capable of preserving university electronic records at both institutions, this project will test the potential of Fedora (the Flexible Extensible Digital Object and Repository Architecture) to serve as the architecture for such an electronic records preservation system’. |
| 30 | Virtual Archives Laboratory (VAL) | The objective is ‘to design and test a model for a Federated Persistent Archives that will examine and address requirements for large-scale long-term preservation of electronic records’. |
| 31 | The Web at Risk: A Distributed Approach to Preserving Our Nation’s Political Cultural Heritage | The aim is to ‘develop web archiving tools that will be used by libraries to capture, curate, and preserve collections of web-based government and political information’. |




7. Digital Preservation Initiatives in India

The Centre of Excellence for Digital Preservation is a project under the National Digital Preservation Programme of the Department of Electronics & Information Technology (DeitY), Government of India. The project is designed and maintained by C-DAC, Pune, India. Its objectives are as follows:

  • Conduct research and development in digital preservation to produce the required tools, technologies, guidelines and best practices.

  • Develop the pilot digital preservation repositories and provide help in nurturing the network of Trustworthy Digital Repositories (national digital preservation infrastructure) as a long-term goal.

  • Define digital preservation standards by involving experts from stakeholder organizations and, as the nodal point for pan-India digital preservation initiatives, consolidate and disseminate the best practices generated through the various projects under the National Digital Preservation Programme.

  • Provide inputs to Department of Electronics & Information Technology in the formation of national digital preservation policy and strategy by identifying and selecting the activities for the National Digital Preservation Programme.

  • Spread awareness about the potential threats and risks due to digital obsolescence and about digital preservation best practices.

8. Summary

The module discusses metadata specifications such as PREMIS and METS, which have the status of de facto standards, with well-defined community processes for their maintenance, and which support the management of digital assets for long-term preservation. The Open Archival Information System (OAIS) reference model defines a framework with a common vocabulary and provides a functional and information model for the preservation community. The METS schema provides a flexible mechanism for encoding descriptive, administrative, and structural metadata for a digital library object, and for expressing the complex links between these various forms of metadata. The module also discusses the major functions of a preservation programme, as set out in UNESCO’s Guidelines for the Preservation of Digital Heritage (2003), and how effective storage management is possible through media-level formatting of information objects. Selected digital preservation programmes in India and worldwide are discussed at the end of the module.

References

Arora, Jagdish (2004). Building digital libraries: An overview. DESIDOC Bulletin of Information Technology, 21(6).

Centre of Excellence for Digital Preservation (National Digital Preservation Programme) (http://www.ndpp.in/ )

Chapman, S., Conway, P., & Kenney, A. R. (1999). Digital imaging and preservation microfilm: The future of the hybrid approach for the preservation of brittle books. Council on Library and Information Resources. Available online at http://www.clir.org/pubs/archives/hybrid.pdf

Conway, P. (1996). Conversion of microfilm to digital imagery: A demonstration project. New Haven, CT: Yale University Library.

Conway, P. (1996). Selecting microfilm for digital preservation: A case study from Project Open Book. Library Resources and Technical Services, 40(1), 67-77.

Cundiff, M. V. (2004). An introduction to the metadata encoding and transmission standard (METS). Library Hi Tech, 22(1), 52-64.

Dappert, A., & Enders, M. (2010). Digital preservation metadata standards. Information Standards Quarterly (ISQ), 22(2). Also available online at http://www.loc.gov/standards/premis/FE_Dappert_Enders_MetadataStds_isqv22no2.pdf

Data dictionary for preservation metadata: Final report of the PREMIS Working Group. OCLC, 2005.

Davis, Eric T. (1997). An overview of the access and preservation capabilities in digital technology. Available online at http://www.frontierfamilies.net/family/diglib/home.html

Day, M. (2005). DCC Digital Curation Manual: Instalment on Metadata. Available online at http://www.dcc.ac.uk/sites/default/files/documents/resource/curation-manual/chapters/preservation-metadata/preservation-metadata.pdf

Hoy, M. (2007). Record-keeping competency standards: The Australian Scene 1. Journal of the Society of Archivists, 28(1), 47-65.

Jones, C. Lee (1993). Preservation film: Platform for digital access systems. Washington, DC: Commission on Preservation and Access, 1-3.

Lavoie, Brian and Gartner, Richard (2005). Preservation metadata. Digital Preservation Coalition (DPC Technology Watch Series Report).

Lavoie, Brian and Gartner, Richard (2013). Preservation metadata (Second edition). DPC Technology Watch Report.

Lee, Christopher A. (2010). Open Archival Information System (OAIS) Reference Model. Encyclopedia of Library and Information Sciences, 4020-4030. Available online at http://www.ils.unc.edu/callee/p4020-lee.pdf

METS: An Overview & Tutorial. Available online at http://www.loc.gov/standards/mets/METSOverview.v2.html

National Archives of Australia (2008). Digital record keeping guidelines: Preserving digital records for the long term. Available online at http://www.naa.gov.au/recordkeeping/er/guidelines/10-preservation.html

Sawyer, Donald, et al. (2001). The Open Archival Information System (OAIS) Reference Model and its usage. Available online at http://public.ccsds.org/publications/documents/SO2002/SPACEOPS02_P_T5_39.PDF

UNESCO’s Guidelines for the Preservation of Digital Heritage (2003). Available online at http://unesdoc.unesco.org/images/0013/001300/130071e.pdf

Willis, Don (1992). A Hybrid Systems Approach to Preservation of Printed Materials.

Wikipedia. Preservation metadata. http://en.wikipedia.org/wiki/Preservation_metadata (Last visited 2 March 2014)

Zarro, M. (2007). PREMIS: PREservation Metadata: Implementation Strategies.

Zeng, M. L., & Chan, L. M. (2006). Metadata interoperability and standardization: A study of methodology, Part II. D-Lib Magazine, 12(6). ISSN 1082-9873. Also available at http://mirror.dlib.org/dlib/june06/zeng/06zeng.html

Portico (http://www.portico.org/) (Last visited March 5, 2014)

LOCKSS (http://www.lockss.org/about/what-is-lockss/) (Last visited March 5, 2014)

CLOCKSS (http://www.clockss.org/) (Last visited March 5, 2014)

Glossary

Access: Ongoing usability of a digital resource, retaining all qualities of authenticity, accuracy and functionality deemed to be essential for the purposes the digital material was created and/or acquired for.

Access System Redundancy: When an entire system is running over two or more computers in two or more data centers. This is an excellent way to ensure that there is little interruption to near-term, ongoing access, but it does not alone guarantee usability, authenticity, or accessibility of the content over the long-term.

AIP (Archival Information Package): In the OAIS conceptual model, a collection (package) of content and preservation description information which is preserved in an OAIS-compliant archive.

Archive: i) An organisation whose function is the preservation of resources, either for a specific community of users, or for the general good; ii) the collection of resources so preserved.

Backup: Content is copied and stored in multiple locations. A well-managed backup system can help to quickly resolve problems with content encountered this week, or next week, or next month, but on its own is insufficient over the long-term.

Byte Replication: A process whereby identical, multiple copies of files, file systems, or websites are created. However, simple byte replication includes no provision to ensure the content is usable when the file formats are no longer current, nor is there any inherent provision to ensure that the content remains discoverable.

CLOCKSS: A not-for-profit joint venture started by libraries and publishers committed to ensuring long-term access to scholarly publications in digital format. Built on low-cost, open-source, award-winning LOCKSS technology, CLOCKSS's decentralized, geographically disparate preservation model ensures that the digital assets of the community will survive intact.

Digital preservation: The processes of maintaining accessibility of digital objects over time.

DIP (Dissemination Information Package): In the OAIS conceptual model, a collection (package) of content and preservation description information which is delivered to the end user from an OAIS-compliant archive.

Hierarchical Storage Mechanisms (HSM): A data storage mechanism in which the most frequently used data is kept on fast disks while less frequently used data is kept on nearline storage, such as an automated (robotic) tape library. An HSM can automatically migrate data from tape to disk and vice versa as required.

Ingest: The process of loading a digital file into a digital repository, along with its descriptive metadata, for subsequent retrieval.

Liaison and Advocacy: The preservation programme and libraries must advocate good practices among producers of digital contents with an aim to facilitate long-term availability of the material for which the programme will be responsible.

LOCKSS: An open-source, library-led digital preservation system built on the principle that “lots of copies keep stuff safe.”

Metadata: A set of data that describes and gives information about other data.

METS (Metadata Encoding and Transmission Standard): An XML schema for packaging digital object metadata.

Microfilming: A photographic process used to produce reduced-size images of textual or graphic material on film. In this process a master is produced from which further copies can be made.

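OAIS (Open Archival Information System): An archive, consisting of an organisation of people and systems, that has accepted responsibility to preserve information and make it available to a designated community. The term also refers to the reference model that describes such an archive.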

OCLC: A nonprofit, membership, computer library service and research organization dedicated to the public purposes of furthering access to the world’s information and reducing library costs.

Pandora:  National web archive for the preservation of Australia's online publications. It was established by the National Library of Australia in 1996, and is now built in collaboration with Australian state libraries and cultural collecting organisations, including the Australian Institute of Aboriginal and Torres Strait Islander Studies, the Australian War Memorial, and the National Film and Sound Archive.

Portico:  A digital preservation service provided by ITHAKA, a not-for-profit organization with a mission to help the academic community use digital technologies to preserve the scholarly record and to advance research and teaching in sustainable ways.

Preservation metadata: Metadata intended to support preservation management of digital materials by documenting their identity, technical characteristics, means of access, responsibility, history, context and preservation objectives.

SIP (Submission Information Package): In the OAIS conceptual model, a collection (package) of content and preservation description information which is submitted to an OAIS-compliant archive.

Technology Preservation: A digital preservation strategy in which digital data are stored as bit streams on a stable digital medium (and refreshed to new media as required), together with preserved copies of the original application software, the operating system under which it would normally run, and the relevant hardware platform.

XML (eXtensible Markup Language): A widely used, application-independent markup language for encoding data and metadata.



