Friday, November 15, 2013

03 Major Components of Digital Library

Suggestions from all readers are warmly invited to help build this blog; please send your suggestions and entries. Its scope is the worldwide knowledge community, and it is intended to make an important contribution to the career building of all competitive-examination candidates. You can send your suggestions to this e-mail address: chandrashekhar.malav@yahoo.com

A

Advanced Search
A systematic effort on the part of a library user or librarian to locate desired information by manual or electronic means, whether successful or not, as opposed to browsing a library collection casually with no clear intention in mind. See also: mediated search, search statement, search strategy, and serendipity.
Also refers to an attempt by a member of the circulation staff of a library, sometimes at the request of a patron, to find an item listed as available in the catalog but not in its correct location on the shelf. See also: missing.

In employment, the formal process of seeking qualified candidates to fill a vacant position, often undertaken by a search committee composed of staff members and/or supervisors who will work closely with the new employee. In libraries, national searches are usually announced in professional publications, such as American Libraries, College & Research Libraries News, and the Chronicle of Higher Education.
Authentication
In online systems, the procedure for verifying the integrity of a transmitted message. Also, a security procedure designed to verify that the authorization code entered by a user to gain access to a network or system is valid. See also: biometric ID, password, PIN, and username.

In archives, the process of verifying, usually through careful investigation and research, whether a document or its reproduction is what it appears or claims to be. The final judgment is based on internal and external evidence, including the item's physical characteristics, structure, content, and context. Compare with certification.
Authorization
In computing, a username, password, PIN, or other access code issued to a person who is permitted to access a specific electronic resource, application program, network, or other computer system that must be entered correctly by the user in order to log on. Authorization codes are usually subject to periodic renewal. A single authentication may have multiple authorizations.

C

Client
A person who uses the services of a professionally trained expert, or of a professional organization or institution, usually in exchange for payment of a fee. Librarians employed in academic and public libraries usually refer to the people they serve as users or patrons because libraries have traditionally provided most services without charge. Information brokers who operate on a fee-for-service basis can be more appropriately said to serve "clients."

Also refers to a computer connected to a network such as the Internet, equipped with software enabling the user to access resources available on another computer, called a server, connected to the same network.

D

Database Management System (DBMS)
A computer application designed to control the storage, retrieval, security, integrity, and reporting of data in the form of uniform records organized in a large searchable file called a database. The range of available DBMS software extends from simple systems intended for personal computers to highly complex systems designed to run on mainframes.
Digital Object
In the technical sense, a type of data structure consisting of digital content, a unique identifier for the content (called a "handle"), and other data about the content, for example, rights metadata. See also: digital asset management and Digital Object Identifier.
Digital Object Identifier (DOI)
A unique code preferred by publishers in the identification and exchange of the content of a digital object, such as a journal article, Web document, or other item of intellectual property. The DOI consists of two parts: a prefix assigned to each publisher by the administrative DOI agency and a suffix assigned by the publisher that may be any code the publisher chooses. DOIs and their corresponding URLs are registered in a central DOI directory that functions as a routing system.
The DOI is persistent, meaning that the identification of a digital object does not change even if ownership of or rights in the entity are transferred. It is also actionable, meaning that clicking on it in a Web browser display will redirect the user to the content. The DOI is also interoperable, designed to function in past, present, and future digital technologies. The registration and resolver system for the DOI is run by the International DOI Foundation (IDF). CrossRef is a collaborative citation linking service that uses the DOI.
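The prefix/suffix structure and the "actionable" property described above can be illustrated with a minimal Python sketch. It assumes that a DOI prefix begins with "10." and is separated from the suffix by the first slash, and that the public resolver is reachable at https://doi.org/. The function names are hypothetical, and the DOI used is only a sample value.

```python
from typing import Tuple
from urllib.parse import quote


def split_doi(doi: str) -> Tuple[str, str]:
    """Split a DOI into (prefix, suffix) at the first slash."""
    prefix, sep, suffix = doi.partition("/")
    # Assumption: every DOI prefix starts with "10." and the suffix is non-empty.
    if not sep or not prefix.startswith("10.") or not suffix:
        raise ValueError(f"not a well-formed DOI: {doi!r}")
    return prefix, suffix


def doi_to_url(doi: str, resolver: str = "https://doi.org/") -> str:
    """Build the URL that asks the resolver to redirect to the content."""
    prefix, suffix = split_doi(doi)
    # Percent-encode characters in the suffix that are unsafe in a URL path.
    return f"{resolver}{prefix}/{quote(suffix, safe='/')}"


prefix, suffix = split_doi("10.1000/182")  # sample DOI, for illustration only
url = doi_to_url("10.1000/182")
```

Because the resolver performs the redirect, the URL built here stays the same even if the publisher later moves the content.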
Digital Rights Management (DRM)
A system of hardware and software components and services, designed to distribute and control the rights to intellectual property created or reproduced in digital form for distribution online or via other digital media, in conjunction with corresponding law, policy, and business models. DRM systems typically use data encryption, digital watermarks, user plug-ins, and other methods to prevent content from being distributed in violation of copyright.
Unfortunately for consumers and libraries, "quick fix" DRM solutions often fail to distinguish between copyright piracy and fair use, may undermine the first sale provision of U.S. copyright law, and can be draconian. For example, many e-book editions completely forbid copying, even for works in the public domain. Carrie Russell, copyright specialist for the American Library Association (ALA), also contends that some DRM solutions threaten "to reduce the functionality of consumer and library electronic equipment, including desktop computers" (Library Journal, August 2003). Wikipedia offers further background on DRM, and the Electronic Privacy Information Center (EPIC) provides a Web page on Digital Rights Management and Privacy.
Digital Watermarking
A translucent papermaker's mark, consisting of lettering and/or an emblematic design, that can be seen faintly in a sheet of quality paper when it is held up to a light source. In hand papermaking, the design is made by sewing or soldering twisted wire to the mold, causing the layer of moist fiber to be thinner over the wire. In mechanized papermaking, the wire is impressed on the moist fiber by a cylinder called the dandy roll before the sheet is sent through a sequence of drying rolls.
Watermarks were originally intended to identify and date the source of production but in time came to designate paper size. Modern watermarks are sometimes used to provide security against forgery. The paper used in a deluxe edition may be watermarked to indicate that it was made especially for the edition. David Badke provides further information about watermarks online. To see images of this elusive form of pictorial art, try a search on the term "unicorn" in Watermarks in Incunabula Printed in the Low Countries, a database maintained by the Koninklijke Bibliotheek. Synonymous with papermark. See also: countermark.

In word processing, a design or lettering printed in a shade of gray across a page, over which the text appears to be superimposed, for example, the word "Draft" to indicate that the text is not the final version. A digital watermark is a sequence of bits skillfully embedded in a data file, such as an audio CD or motion picture on DVD, to help identify the source of copies manufactured or distributed in violation of copyright.
Domain Name System (DNS)
A distributed Internet directory service used primarily to translate the alphanumeric domain names familiar to Internet users (example: www.thisuniversity.edu) into numerical IP addresses (example: 192.0.2.45), and vice versa. The DNS is coordinated by the Internet Corporation for Assigned Names and Numbers (ICANN).

E

Electronic Book
A digital version of a traditional print book designed to be read on a personal computer or e-book reader. Although the first hypertext novel was published in 1987 (Afternoon, A Story by Michael Joyce), electronic books did not capture public attention until the online publication of Stephen King's novella Riding the Bullet in March 2000. Within 24 hours, the text had been downloaded by 400,000 computer users. Some libraries offer access to electronic books through the online catalog. A universally accepted format and simple delivery system are needed.
Electronic Document Delivery
The transfer of information traditionally recorded in a physical medium (print, videotape, sound recording, etc.) to the user electronically, usually via e-mail or the World Wide Web. Libraries employ digital technology to deliver the information contained in documents and files placed on reserve and requested via interlibrary loan.
Electronic Journal
A digital version of a print journal, or a journal-like electronic publication with no print counterpart (example: EJournal), made available via the Web, e-mail, or other means of Internet access. Some Web-based electronic journals are graphically modeled on the print version. The rising cost of print journal subscriptions has led many academic libraries to explore electronic alternatives. Directories of electronic journals are available online (example: Ejournal SiteGuide: a MetaSource maintained by the University of British Columbia Library). Synonymous with e-journal. Compare with electronic magazine.
Electronic Resource
Material consisting of data and/or computer program(s) encoded for reading and manipulation by a computer, by the use of a peripheral device directly connected to the computer, such as a CD-ROM drive, or remotely via a network, such as the Internet (AACR2). The category includes software applications, electronic texts, bibliographic databases, institutional repositories, Web sites, e-books, collections of e-journals, etc. Electronic resources not publicly available free of charge usually require licensing and authentication.
Electronic Theses And Dissertations (ETD)
Master's theses and Ph.D. dissertations submitted in digital form rather than in print on paper, as opposed to those submitted in hard copy and subsequently converted to machine-readable format, usually by scanning. Forty universities in the United States and over 100 institutions worldwide currently participate in the Networked Digital Library of Theses and Dissertations (NDLTD), an initiative to require that all theses and dissertations be submitted in electronic format.
Encyclopedia
A book or numbered set of books containing authoritative summary information about a variety of topics in the form of short essays, usually arranged alphabetically by headword or classified in some manner. An entry may be signed or unsigned, with or without illustration or a list of references for further reading. Headwords and text are usually revised periodically for publication in a new edition. In a multivolume encyclopedia, any indexes are usually located at the end of the last volume. Encyclopedias may be general (example: Encyclopedia Americana) or specialized, usually by subject (Encyclopedia of Bad Taste) or discipline (Encyclopedia of Social Work). In electronic publishing, encyclopedias were one of the first formats to include multimedia and interactive elements (example: Grolier Multimedia Encyclopedia Online). The modern encyclopedia began with the 21-volume Encyclopédie edited by Denis Diderot and Jean d'Alembert, an expression of the rationalism of the 18th-century Enlightenment (Cornell University Library)

F

Full-text Search
A search of a bibliographic database in which the entire text of each record or document is searched and the entry retrieved if the terms included in the search statement are present. Most Web search engines are designed to perform full-text searches. This can pose a problem for the user if a search term has more than one meaning, resulting in the retrieval of irrelevant information (false drops). For example, in a medical database, the query "treatment of AIDS" might retrieve entries for sources containing the phrase "treatment aids in geriatrics" (with "of" a stopword). Compare with free-text search.
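The false drop described above can be reproduced with a few lines of Python. This is only a sketch: the stopword list, the two document titles, and the function names are invented for illustration, and real retrieval systems tokenize and rank in far more elaborate ways.

```python
import re

# Assumed stopword list; real systems use longer, configurable lists.
STOPWORDS = {"of", "in", "the", "a", "an"}


def tokenize(text):
    """Lowercase the text, split on non-letters, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def full_text_match(query, document):
    """True when the query terms occur as consecutive tokens in the document."""
    q, d = tokenize(query), tokenize(document)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


# Invented sample documents.
docs = {
    1: "New approaches to the treatment of AIDS in adults",
    2: "How treatment aids in geriatrics improve mobility",
}

hits = [doc_id for doc_id, text in docs.items()
        if full_text_match("treatment of AIDS", text)]
```

Once case is folded and "of" is dropped, the query reduces to the two tokens "treatment" and "aids", so document 2 is retrieved alongside document 1 even though it has nothing to do with the disease.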

L

Library Portal
Software that allows a computer user to customize online access to collections of information resources by creating a list of Internet connections, much like a personalized directory of street addresses and telephone/fax numbers (example: MyLibrary). Library portals are designed to reduce information overload by allowing patrons to select only the resources they wish to display on their personal interface.

M

Metadata
Literally, "data about data." Structured information describing information resources/objects for a variety of purposes. Although AACR2/MARC cataloging is formally metadata, the term is generally used in the library community for nontraditional schemes such as the Dublin Core Metadata Element Set, the VRA Core Categories, and the Encoded Archival Description (EAD). Metadata has been categorized as descriptive, structural, and administrative. Descriptive metadata facilitates indexing, discovery, identification, and selection. Structural metadata describes the internal structure of complex information resources. Administrative metadata aids in the management of resources and may include rights management metadata, preservation metadata, and technical metadata describing the physical characteristics of a resource.

O

OPAC
An acronym for online public access catalog, a database composed of bibliographic records describing the books and other materials owned by a library or library system, accessible via public terminals or workstations usually concentrated near the reference desk to make it easy for users to request the assistance of a trained reference librarian. Most online catalogs are searchable by author, title, subject, and keywords and allow users to print, download, or export records to an e-mail account. Compare with WebPAC.
OpenURL
A framework and format for communicating bibliographic information between applications over the Internet. The information provider assigns an OpenURL to an Internet resource, instead of a traditional URL. When the user clicks on a link to the resource, the OpenURL is sent to a context-sensitive link resolution system that resolves the OpenURL to an electronic copy of the resource appropriate for the user (and potentially to a set of services associated with the resource). The OpenURL shows promise of becoming an important tool in the interoperation of distributed digital library systems and has the potential to change the nature of linking on the Web.
The OpenURL was conceived at the University of Ghent by Herbert Van de Sompel and Patrick Hochstenbach, and by Oren Beit-Arie of the Ex Libris library automation company, who built a resolution system called SFX, now licensed to Ex Libris. SFX is being used by NISO to draft a U.S. national standard for OpenURL that will be compatible with other standards such as MARC 21, Dublin Core, Online Information Exchange (ONIX), and the Open Archives Initiative (OAI).
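As a rough illustration of how bibliographic information travels in an OpenURL, the sketch below builds a link in the key/value form later standardized as ANSI/NISO Z39.88-2004. The resolver address and the citation are invented placeholders, and the function name is hypothetical; a real link would point at the resolver operated by the user's own library.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical link resolver; each institution runs its own.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def build_openurl(citation, base=RESOLVER_BASE):
    """Encode a journal-article citation as OpenURL key/value pairs."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    }
    # Each citation element is sent as an "rft." (referent) key.
    params.update({f"rft.{key}": value for key, value in citation.items()})
    return f"{base}?{urlencode(params)}"


# Invented placeholder citation, not a real article.
citation = {
    "atitle": "An Example Article",
    "jtitle": "Journal of Examples",
    "issn": "1234-5679",
    "volume": "12",
    "spage": "45",
    "date": "2001",
}

link = build_openurl(citation)
decoded = parse_qs(urlsplit(link).query)
```

The link carries a description of the item rather than its location, which is what lets the resolver choose a copy appropriate for the particular user.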

P


Patent
A legal document issued by the U.S. government, or the government of another country, in response to a formal application process in which the inventor or originator of a new product or process is granted the exclusive right to manufacture, use, and sell it for a designated period of time. The document is assigned a patent number by the patent office for future reference. An x-patent is a patent issued by the U.S. Patent and Trademark Office (USPTO) from July 1790 (when the first U.S. patent was issued) to July 1836. Destroyed by a fire in December 1836 while in temporary storage, the collection of over 9,900 early patents has been reconstructed from inventors' copies. Most large engineering libraries provide patent search databases and services. Further information on U.S. patent law is available from the Legal Information Institute at Cornell University and from the U.S. Patent and Trademark Office site. The Canadian government provides the Canadian Patents Database. Compare with trademark.
Pathfinder
A subject bibliography designed to lead the user through the process of researching a specific topic, or any topic in a given field or discipline, usually in a systematic, step-by-step way, making use of the best finding tools the library has to offer. Pathfinders may be printed or available online.
Persistent URL (PURL)
A type of URL (Uniform Resource Locator) that does not point directly to the location of an Internet resource, but rather to an intermediate resolution service (PURL server) that associates the stable PURL with the actual URL, and returns the URL to the client, which then processes the request in the usual manner. PURLs were developed through OCLC participation in the Internet Engineering Task Force (IETF) Uniform Resource Identifier working groups as an interim solution to the problem posed by URL changes (lack of persistence) in the MARC description of Internet resources. They are an intermediate step on the path to URNs (Uniform Resource Names) in Internet information architecture.
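The indirection a PURL server provides can be sketched as a lookup table that answers with a redirect. Everything in the example is an assumption made for illustration: the paths and hosts are invented, the function name is hypothetical, and the redirect is modeled as an HTTP 302 response with a Location header.

```python
# Hypothetical PURL table: stable path -> current URL (all values invented).
purl_table = {
    "/net/example/annual-report": "https://old-host.example.org/reports/annual.html",
}


def resolve(purl_path, table):
    """Answer a request for a PURL with a redirect to the current URL."""
    target = table.get(purl_path)
    if target is None:
        return 404, {}
    # The client follows the Location header in the usual manner.
    return 302, {"Location": target}


before = resolve("/net/example/annual-report", purl_table)

# The resource moves: only the table entry changes, the PURL itself does not.
purl_table["/net/example/annual-report"] = "https://new-host.example.org/archive/annual.html"
after = resolve("/net/example/annual-report", purl_table)
```

Catalog records that cite the PURL need no correction after the move, because only the resolver's table was updated.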
Protocol
From the Greek protos ("first in time") and kolla ("glue"). In Antiquity, the protokollon was the first sheet of a papyrus roll, usually bearing the official mark of the manufacturer and giving a description of the contents of the manuscript. In modern usage, an original draft of a document. Also, a formal or official statement recording the details of a transaction or proceeding, or a similar record of the procedures and results of a scientific experiment or medical treatment.
In electronic communications, a set of formal conventions for the exchange of data between workstations connected to a computer network, including the rules governing data format and control of input, transmission, and output. Data transmission over the Internet is governed by the TCP/IP protocol implemented in 1982, which allows users of different types of computers to communicate seamlessly. The six main protocols used in Internet addresses (URLs) are:

ftp:// - FTP directory of downloadable data or program files
gopher:// - Gopher server
http:// - Hypertext document on the World Wide Web
mailto: - Electronic mail (e-mail)
news: - Usenet newsgroup
telnet:// - Application program running on a remote host
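The prefix before the colon in each address above is what a program inspects to decide how to handle the address. A small sketch using Python's standard urllib.parse module shows the idea; the sample addresses are invented and the lookup table simply restates the list above.

```python
from urllib.parse import urlsplit

# Meanings restated from the list above.
SCHEME_MEANING = {
    "ftp": "FTP directory of downloadable data or program files",
    "gopher": "Gopher server",
    "http": "Hypertext document on the World Wide Web",
    "mailto": "Electronic mail (e-mail)",
    "news": "Usenet newsgroup",
    "telnet": "Application program running on a remote host",
}


def describe(url):
    """Return the scheme of a URL and what kind of resource it indicates."""
    scheme = urlsplit(url).scheme
    return scheme, SCHEME_MEANING.get(scheme, "unknown scheme")


# Invented sample addresses.
samples = [
    "http://www.example.org/index.html",
    "mailto:someone@example.org",
    "ftp://ftp.example.org/pub/",
    "telnet://host.example.org",
]
described = [describe(u) for u in samples]
```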

R

Rare Book
A book so difficult to find that only a few copies are known to antiquarian booksellers. Those that do exist seldom appear on the market and are consequently coveted. Rare books are often valuable, but not all highly valuable books are rare. Most libraries keep their rare books in a secure location to which access is restricted (usually in special collections). Very rare books are sold at book auctions and by dealers serving collectors (see for example the Philadelphia Rare Books & Manuscripts Company). For a detailed discussion of the history of rare book libraries, see the entry by Daniel Traister in the International Encyclopedia of Information and Library Science (Routledge, 2003). Many libraries are digitizing images from their rare books collections (see the exhibition by the Missouri Botanical Garden Library). Rules for cataloging rare books are given in Descriptive Cataloging of Rare Books, 2nd edition (Library of Congress Cataloging Distribution Service, 1991). The Rare Book School at the University of Virginia is the only one of its kind in the United States.
RSS
Originally, an abbreviation of RDF Site Summary, renamed Rich Site Summary, and later redubbed Really Simple Syndication. A method of Web syndication, originally developed at Netscape, which uses XML file formats to publish frequently updated online works, such as blog entries, news headlines, and audio and video clips, in a standardized format. An RSS document (called a feed or web feed) includes full or summarized text with limited metadata (usually publication date and author). The user subscribes to a feed by entering its URL into an RSS reader or by clicking a feed icon in a Web browser to initiate the subscription process. The RSS reader automatically checks the user's subscribed feeds on a regular basis, downloads any updates, aggregates them, and provides a user interface, enabling the subscriber to monitor and read new items at will without visiting multiple Web sites. RSS readers are available for different platforms, or the user can select a Web-based reader.
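What an RSS reader does with a feed can be sketched with Python's standard XML parser. The feed below is invented for the example and follows the RSS 2.0 layout (a channel containing items with title, link, and pubDate); the function names are hypothetical, and a real reader would also download the feed over the network and handle many optional elements.

```python
import xml.etree.ElementTree as ET

# Invented feed used only for this sketch.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Library News</title>
    <link>https://library.example.org/</link>
    <description>Invented feed used only for this sketch</description>
    <item>
      <title>New e-journals added</title>
      <link>https://library.example.org/news/2</link>
      <pubDate>Fri, 15 Nov 2013 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Extended opening hours</title>
      <link>https://library.example.org/news/1</link>
      <pubDate>Thu, 14 Nov 2013 09:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return (title, link, pubDate) for every item in the feed."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link"), item.findtext("pubDate"))
            for item in root.iter("item")]


def new_items(items, seen_links):
    """Keep only the items the subscriber has not seen before."""
    return [item for item in items if item[1] not in seen_links]


items = read_items(FEED)
unseen = new_items(items, seen_links={"https://library.example.org/news/1"})
```

Comparing item links against those already seen is one simple way a reader can show only the updates since its last check.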

S

Semantic Web
Sir Timothy Berners-Lee (1999): "I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web--the content, links, and transactions between people and computers. A Semantic Web, which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines. The intelligent agents people have touted for ages will finally materialize."
Server
A host computer on a network, programmed to answer requests to download data or program files, received from client computers connected to the same network. Also refers to the software that makes serving clients possible over a network. Servers are classified by the functions they perform (application server, database server, fax server, file server, intranet server, mail server, proxy server, terminal server, Web server, etc.).

V

Virtual Library
A "library without walls" in which the collections do not exist on paper, microform, or other tangible form at a physical location but are electronically accessible in digital format via computer networks. Such libraries exist only on a very limited scale, but in most traditional print-based libraries in the United States, catalogs and periodical indexes are available online, and some periodicals and reference works may be available in electronic full-text. Some libraries and library systems call themselves "virtual" because they offer online services (example: Colorado Virtual Library).

The term digital library is more appropriate because virtual (borrowed from "virtual reality") suggests that the experience of using such a library is not the same as the "real" thing when in fact the experience of reading or viewing a document on a computer screen may be qualitatively different from reading the same publication in print, but the information content is the same regardless of format.

W

Web server
A system capable of providing Internet access to Web-based resources and services in response to requests from client computers on which Web browser software is installed. A Web server includes the necessary hardware and also the operating system, TCP/IP protocols, server software, and information content of the Web sites installed on it. Web server software is designed to accept requests from users to download HTML text, image, and audio files. HowStuffWorks offers a further introduction to Web servers.
..........................................................................................................................................................................

Objectives

Digital libraries are collections of distinct systems and resources that are not available off the shelf as a packaged solution. Building a digital library therefore requires a great deal of integration of various components. The objective of this section is to discuss the following major components of a digital library:

  • Collection Infrastructure including collection development, management and sourcing digital content;

  • Digital Knowledge Organization including metadata and its role in browse, search and navigation, object naming and addressing, unique object identifiers, etc.;

  • Access Infrastructure including search, browse and navigation interfaces and subject gateways;

  • Computers and Network Infrastructure including server-side hardware components, server-side software components, client-side hardware & software components;

  • Intellectual Property Rights (IPR) and Digital Rights Management including IPR issues in the digital world and the technology used for access control in a digital library; and

  • Digital Library Services including E-mail alerts, RSS feeds or Atom feeds, Ask-an-Expert, electronic document delivery services, Web-based user education, digital reference service, real-time reference service, my settings, my saved searches and my saved articles.
1. Introduction

    Establishing digital library resources and services requires many infrastructural components that are not available off the shelf as a packaged solution. There are no turn-key, monolithic systems available for digital libraries. Instead, digital libraries are collections of disparate systems and resources connected through a network, made interoperable using open system architecture and open protocols, and integrated within one interface, currently the Web interface. The use of open architecture and open standards makes it possible to gather the pieces of required infrastructure, be it hardware, software or accessories, from different vendors in the marketplace and integrate them to construct a working digital library environment. Several components required for establishing a digital library are internal to the institution, but several others are distributed across the Internet, owned and controlled by a large number of independent players. Building a digital library therefore requires a great deal of integration of various components (Flecker, D., 2001). The major components required for a digital library can broadly be divided into the six categories mentioned above and depicted in Figure 1.
    These components are described briefly in this module. Separate modules are devoted to detailed information on each of the six components.

    Fig.1: Components of Digital Library
2. Collections Infrastructure

    The most important component of a digital library is the digital collection it holds or has access to. The viability and usefulness of a digital library depend upon the critical mass of its digital collection. The collection infrastructure typically consists of two components: metadata and the digital objects that a digital library holds. The metadata provides bibliographic or index information for the digital objects. While digital objects are the primary documents that users are interested in accessing, it is metadata that facilitates their identification, retrieval and location using a variety of search techniques. The information content of a digital library, depending on the media types it contains, may include a combination of structured and unstructured text, numerical data, scanned images, graphics, audio and video recordings and other multimedia content. Different types of resources need to be handled differently in a digital library.

    Libraries, irrespective of the media types they house (print, audio-visual or digital), are primarily responsible for identifying, selecting, organizing, preserving and providing access to diverse categories of resources for their users. The transition from a traditional library to a digital library cannot happen overnight in a single step; rather, it is gradual and incremental. As such, traditional libraries are not becoming digital libraries, but are increasingly acquiring access to ever-growing digital collections for their users, either by licensing e-resources available in the marketplace or by acquiring them on a one-time purchase, perpetual access basis. Collections in digital libraries may also consist of datasets that are "born digital" or of existing printed documents converted into digital format through scanning. Virtual libraries, library portals and subject gateways are also considered important parts of a digital library collection. Collection management in a digital or hybrid library needs pre-defined policies and practices similar to those followed in a traditional library, while keeping in view the issues and complexities that are specific to digital materials.

    The current electronic publishing market consists of traditional players such as commercial publishers, scholarly societies and university presses offering electronic versions of their print journals, as well as several new enterprises offering new products and services that are "born digital". The market also has several aggregators that provide electronic resources in a given discipline sourced from different publishers. These publishers offer a variety of electronic resources including electronic journals, electronic books, conference proceedings, online courseware, learning materials, tutorials, guides, manuals, patents, standards, e-prints (preprints and postprints), technical reports, electronic theses and dissertations, online databases and databanks, dictionaries, encyclopaedias, subject portals and pathfinders. Major publishers, besides offering their electronic journals, are now offering electronic books either directly through their Web sites, in partnership with other publishers, or through aggregators like ebrary, NetLibrary, Questia, 24x7, Knovel, etc. Moreover, more than 32,000 books are available free of cost through Project Gutenberg. These electronic resources are available under a variety of pricing models.

    A separate module is devoted to collection development, selection, acquisition, licensing and management of digital resources.
3. Digital Knowledge Organization

    A traditional library consists of physical objects such as books, journals, conference documents, standards, patents, videos, microfilms and CDs that are organized into various collections such as Text Books, General Books, Reference Books, Rare Books, Audio-visuals, CD-ROM Collections and Journals. Each collection is further organized using classification schemes such as Dewey Decimal Classification, Library of Congress Classification, Universal Decimal Classification, Colon Classification, etc. so as to bring books on the same subject together and facilitate browsing documents on the shelves. Moreover, each book is catalogued and assigned subject headings using standard subject heading lists and thesauri like Library of Congress Subject Headings (LCSH), Medical Subject Headings (MeSH), Sears List of Subject Headings, etc. so as to facilitate retrieval using the library OPAC. While physical libraries are organized at the level of physical items, i.e. books, journals, theses, reports, reference books, textbooks, etc., digital libraries are organized at the level of digital objects, which may include a combination of structured and unstructured text, numeric data, scanned images, graphics, articles in a journal or chapters in a book, and other multimedia objects.

    A disc full of digital objects without any organization or any browse, search and navigation options would be useless, since digital objects need to be organized and made accessible to the user community. An effective and efficient access mechanism that allows a user to browse, search and navigate digital resources becomes necessary as the electronic resources of a collection grow in number and complexity. As digital libraries are built around Web and Internet technology, they use the object naming and addressing protocols of the Internet. The process of organizing digital objects includes: i) developing a metadata schema; ii) assigning different kinds and levels of metadata to each digital object; iii) assigning a Unique Object Identifier to each digital object; iv) linking digital objects with their associated metadata to facilitate browsing, searching and navigation; v) organizing digital objects and associated metadata into a database; and vi) building browse, search and navigation interfaces.
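The six steps listed above can be made concrete with a very small sketch built on Python's standard sqlite3 and uuid modules. The schema, the three metadata elements (title, creator, subject), the "obj:" identifier format, the file paths and the sample records are all assumptions chosen for illustration; production systems use richer metadata schemas and established identifier systems.

```python
import sqlite3
import uuid

# Steps i and v: a minimal metadata schema stored in a database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE objects (object_id TEXT PRIMARY KEY, location TEXT NOT NULL);
CREATE TABLE metadata (
    object_id TEXT NOT NULL REFERENCES objects(object_id),
    element   TEXT NOT NULL,
    value     TEXT NOT NULL
);
""")


def add_object(location, metadata):
    """Steps ii to iv: assign an identifier and link metadata to the object."""
    object_id = f"obj:{uuid.uuid4()}"  # hypothetical identifier format
    conn.execute("INSERT INTO objects VALUES (?, ?)", (object_id, location))
    conn.executemany("INSERT INTO metadata VALUES (?, ?, ?)",
                     [(object_id, element, value) for element, value in metadata.items()])
    return object_id


def browse(element):
    """Step vi (browse): list the distinct values of one metadata element."""
    rows = conn.execute(
        "SELECT DISTINCT value FROM metadata WHERE element = ? ORDER BY value", (element,))
    return [row[0] for row in rows]


def search(element, term):
    """Step vi (search): find objects whose metadata value contains the term."""
    rows = conn.execute(
        "SELECT o.object_id, o.location FROM objects o "
        "JOIN metadata m ON m.object_id = o.object_id "
        "WHERE m.element = ? AND m.value LIKE ? ORDER BY o.location",
        (element, f"%{term}%"))
    return rows.fetchall()


# Invented sample records.
thesis_id = add_object("/store/etd/0001.pdf",
                       {"title": "A Thesis on Digital Libraries", "creator": "A. Student",
                        "subject": "Digital libraries"})
plate_id = add_object("/store/images/plate-12.tif",
                      {"title": "Botanical plate 12", "creator": "Unknown",
                       "subject": "Botany"})

subjects = browse("subject")
found = search("title", "thesis")
```

Storing metadata as element/value rows keeps the schema open to new elements, at the cost of more joins when searching across several fields at once.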

    A separate module is devoted to digital knowledge organization that would deliberate on the issues mentioned above.
  • 4. Access Infrastructure: Browse, Search and Navigation Interfaces of Digital Library

    An effective and efficient access mechanism that allows a user to browse, search and navigate digital resources becomes necessary as the electronic resources of a collection grow in number and complexity. While the access infrastructure for a traditional library is the OPAC / WebPAC (including journal holdings), the access infrastructure for digital libraries consists of browse, search and navigational interfaces for individual digital libraries, specialized indices for specialized local collections, portals or subject gateways for web resources, and an integrated interface for all e-resources accessible to a given library, including the library OPAC.
  • 4.1. Search, Browsing and Navigational Interfaces


    Users interact with a digital library through its search interface, which typically supports browsing, searching and navigation. The search interface provides a visual window through which users search and browse relevant information stored in a digital resource and display it.

    Most digital libraries support searching with varying degrees of capability, ranging from “simple search” to “advanced search”. In the simple search mode, a user enters his or her “query” in the search box. In the advanced search mode, a user can use Boolean queries, wild cards, phrase searches and field-specific searches. Many digital libraries also support relevance ranking of search results, based on the relevance score of the retrieved documents. A typical digital library implementation may employ a variety of information retrieval techniques, including metadata searching, full-text document searching and content search, or a combination of two or all of them. Digital libraries consisting of images also support image-based searching based on the names of objects appearing in the images. Digital libraries built around Geographical Information Systems (GIS) with geo-spatial data support retrieval of relevant portions of maps which can be zoomed in or out (Example: National Geographic Map Machine: http://plasma.nationalgeographic.com/mapmachine/).
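The advanced search features mentioned above (Boolean operators, wild cards, phrase search, field-specific search and relevance ranking) can be illustrated with a toy example. The records, field names and scoring rule below are assumptions made for illustration; production systems use far richer indexing and ranking.

```python
import fnmatch
import re

# Hypothetical records; the field names are illustrative only.
RECORDS = [
    {"id": 1, "title": "Digital library services",
     "abstract": "Reference services in a digital library"},
    {"id": 2, "title": "Copyright and licensing",
     "abstract": "Licensing of digital collections"},
    {"id": 3, "title": "Library chat rooms",
     "abstract": "Real time reference using chat"},
]


def tokens(text):
    """Split text into lower-case words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def match_term(record, term, field=None):
    """True if a term (optionally with * or ? wild cards) occurs in the record."""
    fields = [field] if field else ["title", "abstract"]
    words = [w for f in fields for w in tokens(record[f])]
    return any(fnmatch.fnmatchcase(w, term.lower()) for w in words)


def match_phrase(record, phrase, field=None):
    """True if the words of the phrase occur next to each other, in order."""
    fields = [field] if field else ["title", "abstract"]
    wanted = tokens(phrase)
    for f in fields:
        words = tokens(record[f])
        for i in range(len(words) - len(wanted) + 1):
            if words[i:i + len(wanted)] == wanted:
                return True
    return False


def search(must=(), should=(), must_not=(), field=None):
    """Boolean search: AND over `must`, OR over `should`, NOT over `must_not`.

    Results are ranked by a crude relevance score: the number of query
    terms that the record matches.
    """
    hits = []
    for rec in RECORDS:
        if any(match_term(rec, t, field) for t in must_not):
            continue
        if not all(match_term(rec, t, field) for t in must):
            continue
        if should and not must and not any(match_term(rec, t, field) for t in should):
            continue
        score = sum(match_term(rec, t, field) for t in list(must) + list(should))
        hits.append((score, rec["id"]))
    return [rec_id for score, rec_id in sorted(hits, key=lambda h: (-h[0], h[1]))]
```

For instance, `search(must=["librar*"], field="title")` restricts a wild-card query to the title field, while `search(should=["chat", "reference"])` ranks records matching both terms ahead of records matching only one.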

    Information retrieval in a digital library is made more effective and user-friendly by preprocessing digital documents to extract additional metadata before storing them in a database. The database is then configured to generate indices from selected fields such as authors, titles and abstracts, or from the full text of articles with a pre-defined stop-word list. Depending upon the implementation, a search may be restricted to a single server or extended to several geographically dispersed servers. Digital libraries also support “federated searches”, wherein the search query is sent to search systems on different servers and the results received from them are merged and presented to the user. A typical example of federated searching is the Networked Digital Library of Theses and Dissertations (NDLTD) project (http://www.ndltd.org/).
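A federated search, as described above, can be sketched as follows. The "servers" here are hypothetical in-memory stand-ins for remote search systems, and the scoring is a simple word count; in practice, scores returned by different servers are rarely directly comparable and need normalizing before merging.

```python
from concurrent.futures import ThreadPoolExecutor


def make_server(name, documents):
    """Return a stand-in for a remote search service (hypothetical, in-memory)."""
    def search(query):
        q = query.lower()
        results = []
        for doc_id, title in documents.items():
            score = title.lower().split().count(q)
            if score:
                results.append(
                    {"server": name, "id": doc_id, "title": title, "score": score}
                )
        return results
    return search


def federated_search(query, servers):
    """Send the query to every server in parallel, then merge and rank results."""
    with ThreadPoolExecutor(max_workers=max(1, len(servers))) as pool:
        result_lists = list(pool.map(lambda server: server(query), servers))
    merged = [hit for hits in result_lists for hit in hits]
    # Drop duplicates when several servers return the same title.
    seen, unique = set(), []
    for hit in sorted(merged, key=lambda h: (-h["score"], h["title"], h["server"])):
        key = hit["title"].lower()
        if key not in seen:
            seen.add(key)
            unique.append(hit)
    return unique
```

The user sees a single merged list and need not know which server supplied each record, which is the main attraction of the federated approach.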

    Besides a search interface, a digital library needs a browsing interface to give users a sense of the amount and variety of material it holds and of the attributes of that material. Browsing helps a user to learn about the collection in general, the topics covered and the kinds of material available (Marchionini, 1998). The browsing interface of a digital library generally consists of a combination of hierarchical menus and selection buttons that guide the user from a top-level subject category through a series of progressively narrower levels, from which the user selects and retrieves the associated digital objects. The browsing interface for a full-text library, for example, may present research articles i) arranged alphabetically, with author's name, article title and year of publication as selectable criteria; and ii) arranged hierarchically under subject categories. Most digital libraries also support browsing through tables of contents that are linked to the full text or to specific chapters and sections.

    Digital libraries consist not only of a multitude of resources but also of a multitude of mechanisms to access them. A number of standards and technologies are now available that enable interoperability and cross-searching of digital libraries. Two approaches used for cross-searching multiple heterogeneous digital repositories are: i) the metadata harvesting approach, also called discovery services; and ii) the distributed or federated searching approach, which provides direct, real-time access to information sources on the web without crawling, replicating or harvesting metadata. Commercially available federated search solutions include 360 Search (Serials Solutions), MetaLib (Ex Libris) and Knimbus (GIST), while Primo Central (Ex Libris), EBSCO Discovery Service (EBSCO) and Encore Discovery (Innovative Interfaces) are examples of discovery services.
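The metadata harvesting approach can be illustrated with a short sketch. The text above does not name a protocol; the example assumes OAI-PMH with Dublin Core records, which is one widely used harvesting protocol. The repository address and the sample response are invented for illustration, and no network request is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces used by OAI-PMH responses carrying Dublin Core metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL for a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date  # harvest only records changed since this date
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Extract (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    harvested = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        harvested.append((identifier, title))
    return harvested


# A hand-written sample response, trimmed to the elements used above.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Building digital libraries</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""
```

Unlike a federated search, the harvested records are stored and indexed centrally ahead of time, so a user's query runs against one local index rather than against many remote servers.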

    Separate modules are devoted to access infrastructure, the design of browsing, search and navigational interfaces, federated and discovery search solutions, etc.
  • 5. Network and Computing Infrastructure

    A typical digital library in a distributed client-server environment consists of hardware and software components on the server side as well as on the client side. Clients are the machines that users employ to access the digital library, while the server hosts the databases, digital objects, and browse, search and navigational interfaces that facilitate access.

    Computer hardware, software and network infrastructure for a digital library can broadly be divided into the following four  categories:

         i)          Server-side Hardware Components including input devices, storage devices, Communication Devices, etc.;

       ii)          Server-side Software Components including image capturing or scanning software, image enhancement and manipulation software, web servers, information retrieval software, Optical Character Recognition (OCR) software, Database Management System (DBMS) software, Digital Rights Management (DRM) software, etc.;

     iii)          Client-side Hardware Components including PCs, laptops and mobile devices; and

     iv)          Client-side Software Components including Web browsers, Adobe’s Acrobat Reader, media players, word processing software, spread sheet software, image processing software, etc.

    A separate module is devoted to deal with computer hardware, software and network infrastructure requirement of a digital library. 
  • 6. Intellectual Property Rights (IPR) and Digital Rights Management
  • 6.1. Intellectual Property Rights (IPR)

    Copyright has been called the "single most vexing barrier to digital library development" (Chepesuik, 1997). The current paper-based concept of copyright breaks down in the digital environment because the control of copies is lost. Digital objects are less fixed, easily copied, and remotely accessible by multiple users simultaneously. The libraries, unlike private businesses or publishers that own their information, are simply caretakers of information. Physical ownership or possession of material by a library is not necessarily an indicator of ownership of corresponding copyright. It is unlikely that libraries will ever be able to freely digitize and provide access to the copyrighted materials in their collections. Instead, the developers of digital libraries are obliged to take permission for inclusion of copyrighted material in digital form or develop mechanisms for managing copyright, mechanisms that allow them to provide information without violating copyright. Copyrights and IPR issues are governed by the constitutions of various countries and through international treaties like the Berne Convention.

    “Fair Use” is an exception to copyright protection that permits limited use of copyrighted material without the explicit permission of the owner for non-commercial and non-profit educational purposes. Protection and ownership of intellectual property in the age of electronic information are especially confusing in the light of traditional copyright laws. Discussions are taking place in various forums to review existing copyright laws in the light of electronic information. Since images are forwarded electronically around the Internet, it becomes very difficult to define and control what can and cannot be done.

    In the digital world, copyright is manifested in terms of licenses and agreements. A library is required to sign a license to acquire access to a digital collection. The terms of licenses for digital collections vary in their conditions, pricing models and access limitations (see Collection Development – licensing contents). Library associations and publishers are working on model licenses that can be adopted uniformly. Libraries can negotiate with publishers on behalf of their institutions or as a consortium of libraries.

    In January 1996, a working party of the Conference on Fair Use (CONFU), comprising both publishers and librarians, began the process of developing practicable guidelines for fair use of electronic information. The first discussions concerned the scanning, storage, reproduction and distribution of materials in an electronic preservation system. The working party failed to agree on any guidelines, but the dialogue is still alive and is expected to result in some guidance to both libraries and academics on what is permissible without prior permission (Cox, 1997).

    Section 108 of the US Copyright Act, as amended by the Digital Millennium Copyright Act (DMCA, 1998), gives libraries the right to make up to three copies of unpublished or published materials owned by them for preservation or security purposes, as long as the copies are made to replace a copy that the library has, or used to have, in its collection and that is damaged, deteriorating, lost or stolen, or whose format has become obsolete. Such published works must also be out of print. The items being preserved can be in any format (text, images, sound, etc.). Furthermore, the copies can be digital, so long as they are neither distributed digitally nor made available to the public in a digital format outside the premises of the library or archives.
  • 6.2. Digital Rights Management and Access Control in Digital Library

    Access management, variously called access control, terms and conditions, licensing conditions and Digital Rights Management (DRM), refers to the control of access to digital libraries. DRM is a set of solutions designed to prevent unauthorized access to, and duplication and illegal distribution of, copyrighted digital media. DRM technology was created for publishers as a means to stop illegal reproduction and distribution of their products. In the online environment, DRM can be used to control access to and usage of digital objects and to impose restrictions against their misuse.

    Four distinct aspects of access management are: i) license agreements and policies; ii) user authentication and authorization; iii) accuracy and integrity of digital content; and iv) accessibility, including permissions to operate on digital objects or their metadata. License agreements and policies are negotiated between publishers and librarians or consortia coordinators for providing access to digital libraries. Users are authenticated and authorized to access the content of a digital library as per the terms and conditions of the license agreement. While duly authenticated users are allowed access to information according to their clearance and authority, unauthorized users are blocked from accessing it. Confidentiality is of paramount importance in digital libraries containing national defence information or highly proprietary information. Accuracy or integrity means that the information stored on digital object servers remains unaltered over time: a digital library must not allow accidental or intentional corruption of stored information by unauthorized users or programs. Accessibility means that a secure computer system must keep information available to its users. The hardware and software of the system should keep working efficiently, and the system should be able to recover quickly in case of disaster. Moreover, users are given access to digital content with permission to download, while editors are given permission to add, edit, delete or amend it (Russell, D and Gangemi, G.T., 1991).
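One common way to detect accidental or intentional corruption of stored objects is to record a checksum when an object is deposited and to recompute it later. The sketch below assumes SHA-256 and a plain dictionary as the manifest; the function names are hypothetical and the text above does not prescribe any particular technique.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of an object's bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def register(manifest, object_id, data):
    """Record the checksum of an object at the time it is deposited."""
    manifest[object_id] = checksum(data)


def verify(manifest, object_id, data):
    """True only if the object is known and its bytes are unchanged."""
    return manifest.get(object_id) == checksum(data)
```

A periodic job that calls `verify` for every stored object can flag files that need to be restored from backup.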

    It is essential to ensure the security of data not only on servers and clients but also during communication between them, so as to preserve the authenticity and integrity of the data. A hacker can eavesdrop on communication between a user's browser and a Web server and steal sensitive information such as a credit card number, login ID and password, or any other confidential data. A hacker could also try to impersonate authorized users in order to obtain information that is normally not disclosed without authorization. Incidents of hackers gaining access to important Web sites and defacing them are not uncommon. Data encryption techniques are used for communicating sensitive information such as users' passwords and PIN codes. Encryption renders data unintelligible and unusable even if it is accessed by an unauthorized person. Digital certificates are deployed to establish secure communication between clients and servers.

    6.2.1. User Authentication

    Publishers deploy one or more authentication mechanisms to allow authorized users to access the digital content hosted in digital libraries. These authentication mechanisms are: i) Log-in ID and Password-based Access; ii) IP Filtering; iii) Web Cookies; iv) Web Proxy; v) Athens; vi) Shibboleth; and vii) Referring URL. These authentication mechanisms are described in detail in the module on access management.
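As an illustration of one of these mechanisms, IP filtering amounts to checking whether a request comes from an address range registered by a subscribing institution. The institution name and address ranges below are hypothetical (they are reserved documentation ranges), and the function name is an assumption.

```python
import ipaddress

# Hypothetical address ranges registered by a subscribing institution.
SUBSCRIBER_NETWORKS = {
    "Example University": [
        ipaddress.ip_network("192.0.2.0/24"),
        ipaddress.ip_network("2001:db8::/32"),
    ],
}


def institution_for(client_ip):
    """Return the subscribing institution for an address, or None if unknown."""
    address = ipaddress.ip_address(client_ip)
    for name, networks in SUBSCRIBER_NETWORKS.items():
        if any(address in network for network in networks):
            return name
    return None
```

Because the check depends only on the network address, users on campus need no password, but users working from home must come in through another mechanism such as a web proxy.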

  • 6.2.2. User Authorization


    The process of authentication ascertains the identity of a user, while authorization defines his or her permissions in terms of access to e-resources and the extent of their usage. Authorization is granted to successfully authenticated users according to the rights information available in the Access Management System (AMS). A user duly authenticated by one of the authentication mechanisms described above may be entitled to access only a portion of the digital collection subscribed to by his or her institution. For example, an authenticated user may be authorized to access electronic journals from a publisher's web site but not electronic books, reference sources or other resources, depending on what the institution has subscribed to. Typically, all users in an institution are authorized to access all the subscribed e-resources. However, it is possible to define different levels of authorization for different categories of personnel in an institution. Besides authorizing the users of a digital collection, authorization also addresses the responsibilities assigned to the personnel involved in developing a digital library and their respective authority to add, delete, edit and upload records. Such personnel may be assigned different levels of authority. Authorization is more challenging than authentication, especially for widely distributed digital libraries. Access control is one method for enforcing authorization. Typically, it assumes that the user or entity has already been authenticated. Access control policies in common use include i) Mandatory Access Control (MAC); ii) Discretionary Access Control (DAC); iii) Role Based Access Control (RBAC); and iv) Content Dependent Access Control (CDAC). These access control policies are described in detail in the module on access management.
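Role Based Access Control, one of the policies listed above, can be sketched as a mapping from roles to permitted actions. The role names, user names and permissions below are hypothetical, and the sketch assumes the user has already been authenticated.

```python
# Hypothetical roles and the actions each role is allowed to perform.
ROLE_PERMISSIONS = {
    "reader": {"view", "download"},
    "editor": {"view", "download", "add", "edit"},
    "administrator": {"view", "download", "add", "edit", "delete"},
}

# Hypothetical assignment of roles to already-authenticated users.
USER_ROLES = {
    "asha": {"reader"},
    "ravi": {"reader", "editor"},
}


def is_authorized(user, action):
    """True if any role held by the user permits the requested action."""
    roles = USER_ROLES.get(user, set())
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in roles)
```

Permissions are attached to roles rather than to individuals, so changing what editors may do requires a single change to the role table.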

6.2.3. Technology of Access Control and Access Tracking in Digital Library

A number of copy-protection and access control technologies have been devised that either restrict or completely stop unauthorised use of copyrighted digital material. These technologies include i) Digital Watermarking; ii) Control on Extent of Use and Subsequent Use; iii) Fractional or Partial Access; iv) Flickering; and v) Digital Object Identifier (DOI). These access control technologies are described in detail in the module on access management.

 

A separate module is devoted to Intellectual Property Rights (IPR) and Digital Rights Management (DRM).

 

  • 7. Digital Library Services


    Early research and development in digital libraries focused mainly on providing search and browsing interfaces to collections. However, providing access to resources is only one of the several services that a traditional library offers its users. Reference services, for example, provide personalized assistance with a human touch. The importance of reference service has increased manyfold with the introduction of new information technologies in libraries. Users who are not well versed in web and Internet technology find it difficult to retrieve information from the plethora of resources accessible to them in various digital repositories. Sloan (1998) emphasised that technology and information sources, on their own, cannot make up an effective digital library. Helping users to find resources, in either the physical or the electronic environment, is the foremost task of a librarian.

    Digital resources and the associated technical infrastructure are only a means of generating services for potential users. Just as library staff use printed resources to generate services in traditional libraries, digital resources are used to generate services through software-driven web-based interfaces. Computer programs substitute for the intellectually demanding tasks traditionally carried out by skilled professionals. Activities that require considerable mental effort, such as reference service, cataloguing and indexing, and information seeking, are performed by computer programs through web-based interfaces, with or without human intervention.

    Web-based digital resources can potentially support a range of traditional and non-traditional library services. While librarians may not be responsible for the design and implementation of digital library infrastructure, they are, as managers of digital libraries, responsible for generating digital library-based services and creating awareness about them. Most library services generated using digital resources closely resemble those generated manually, with improvements and modifications to suit the requirements of automated services. However, digital resources have also been used to generate innovative services that have no counterpart in the manual environment. While a separate module deals with digital library services in detail, these services are described briefly here.

  • 7.1. E-mail Alerts



    The service, variously called E-mail Alert, Table of Contents Alert, News Alert, etc., allows an end user to set up e-mail alerts for the table of contents of a specific journal or group of journals. A user can subscribe to e-mail alerts to receive periodic e-mails with links to new content as it is added to the publisher's web site. The service, offered by most digital libraries and databases, can broadly be equated with the Current Awareness Services (CAS) offered by traditional libraries.

    The first time a user requests an e-mail alert or table of contents alert, he or she is required to create a personal user profile / user login. The user is prompted to provide details such as name, e-mail address, postal address, field of interest, user name and password. Once these details are filled in and a login ID and password are assigned, the user logs in to the publisher's web site and can start creating his or her profile there. A user may select the journal titles or subject areas for which he or she would like to receive regular e-mail alerts. All e-journal publishers that provide an e-mail alerting service also provide some kind of online help and/or FAQs. Publishers offer a variety of e-mail alerts including ToC alerts, new issue alerts, citation alerts, publication alerts, online first alerts, search alerts, favourite journal alerts, etc.
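At its core, an alerting service matches newly published items against stored user profiles. The sketch below shows that matching step only; the profile structure, e-mail addresses and article records are hypothetical, and actual delivery of the e-mail is left out.

```python
# Hypothetical user profiles: journals and subject areas chosen by each user.
PROFILES = [
    {"email": "reader1@example.org",
     "journals": {"Journal of Examples"}, "subjects": set()},
    {"email": "reader2@example.org",
     "journals": set(), "subjects": {"metadata"}},
]


def build_alerts(new_articles, profiles):
    """Map each user's e-mail address to the new article titles matching the profile."""
    alerts = {}
    for profile in profiles:
        matches = [
            article["title"]
            for article in new_articles
            if article["journal"] in profile["journals"]
            or profile["subjects"] & set(article["subjects"])
        ]
        if matches:
            alerts[profile["email"]] = matches
    return alerts
```

Users with no matching articles receive nothing, which keeps the service close to the selective spirit of a Current Awareness Service.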

7.2. Web Feeds: RSS Feeds or Atom

Web feeds are data formats used for providing users with frequently updated content. The two main web feed formats are RSS and Atom. RSS stands for Really Simple Syndication or Rich Site Summary, and the Atom format was developed as an alternative to RSS. On the one hand, the technology allows a web site to list its newest updates (such as tables of contents of journals or new articles) in an XML format; on the other hand, it helps web users to keep track of new updates on chosen web sites. Like a personal search assistant, an RSS feed reader visits pre-defined web sites, looks for updated information and fetches it automatically to the user's desktop. To use RSS feeds, users need an RSS feed reader or feed aggregator, which can be web-based, desktop-based or mobile-device-based, and then “subscribe” to a feed by copying its link from the web site of a digital repository into the feed reader. The reader then checks the subscribed feeds for content added since the last check and, if there is any, retrieves that content and presents it to the user. Both RSS and Atom are supported by most feed readers.

Digital repositories of most publishers provide RSS feeds for delivering the contents of their journals to users. RSS feeds on web pages are typically indicated by a small rectangular icon bearing letters such as RSS or XML. Users generally have a choice of receiving the contents of all issues of a journal or only content on a given topic or subject.
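What a feed reader does when it checks a subscribed feed for new content can be sketched as follows. The sample feed is hand-written RSS 2.0 with invented titles and links, the function name is an assumption, and fetching the feed over the network is omitted.

```python
import xml.etree.ElementTree as ET

# A hand-written RSS 2.0 feed standing in for a publisher's table-of-contents feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Journal of Examples: Table of Contents</title>
    <item><title>Article A</title><link>http://example.org/a</link><guid>a</guid></item>
    <item><title>Article B</title><link>http://example.org/b</link><guid>b</guid></item>
  </channel>
</rss>"""


def new_items(feed_xml, seen_guids):
    """Return (title, link) pairs for items not seen before and remember them."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iterfind("./channel/item"):
        guid = item.findtext("guid") or item.findtext("link")
        if guid not in seen_guids:
            seen_guids.add(guid)
            fresh.append((item.findtext("title"), item.findtext("link")))
    return fresh
```

Calling `new_items` repeatedly with the same `seen_guids` set returns each article only once, which is how a reader shows only what has been added since the last check.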

7.3. Ask-An-Expert

“Ask-An-Expert” is a service offered by several digital libraries and databases. It is an Internet-based question and answer service that connects users with experts who possess specialized knowledge and skill in a given domain. Digital libraries and databases provide this service as a platform where experts answer specific questions and instruct users in developing certain skills. The service is often restricted to a given user community or to subscribers of a database or full-text resource.

7.4. Electronic Document Delivery Services

The term "electronic document delivery service" implies delivery of an electronic version of a document, which may involve producing an electronic copy if the document is not already available in electronic format. However, with most peer-reviewed research journals now available in electronic format, most publishers and aggregators offer online electronic document delivery services that allow a user to download the full text of an article from their site at a pre-determined cost. Publishers and aggregators offer different payment options: some charge each time the journal is used, whereas others provide restriction-free access for an annual subscription.

7.5. Web-based User Education

Publishers of full-text electronic resources and bibliographic databases offer web-based user guides, teaching tools and “spoken tutorials” that help users make the best use of the resources the publisher makes available. These guides may include colour graphics, screenshots and animations. Web-based user education offers users a high degree of interactivity and flexibility: it is self-paced, graduated from basic to highly advanced levels, and designed in a wide range of formats that accommodate diverse learning styles. The proliferation of web-based full-text e-resources and databases has generated greater demand for such reference and instructional services. As digital resources can be used anywhere, at any time, the need for instructional and reference services has also grown.

7.6 Digital Reference Service

Digital Reference Service, also called Virtual Reference Service or “Ask-A-Librarian”, is a service in which libraries or similar voluntary organizations offer reference services to users, typically through e-mail or the web. It is generally considered an extension of a library's existing reference service to users who are not able to visit the library in person. Where voluntary organizations offer digital reference services, the people who serve as digital reference experts (also called volunteers or mentors) are most often information specialists affiliated with various libraries. Most digital reference services have a web-based question submission form, an e-mail address, or both, and users may submit questions using either. Once a question is received by the service, it is assigned to an individual expert, who responds with factual information and/or a list of information resources. The response is either sent to the user's e-mail account or posted on the web so that the user can access it after a certain period of time. Many services have informative web sites that include archives of questions and answers and a set of FAQs. Users are usually encouraged to browse the archives and FAQs before submitting a question, in case sufficient information already exists.

The QuestionPoint service (www.QuestionPoint.org) is a subscription-based service that provides libraries with access to a growing collaborative network of reference librarians in the United States and around the world. Library patrons can submit their questions at any time of the day or night through their library's Web site. The questions are answered online by qualified library staff from the patron's own library or may be forwarded to a participating library elsewhere in the world.

7.7. Real time Digital Reference Service: Library Chat Rooms

Many libraries are experimenting with Internet chat technology as an innovative method for offering real-time digital reference service, using chat software, live interactive communication software, call centre management software, web contact software, bulletin board services, interactive customer assistance systems, etc. While conventional digital reference service is an asynchronous method of information delivery, Internet chat provides the benefit of synchronous communication between a user and a reference librarian (or mentor). Interactive reference services allow a user to talk to a live reference librarian at any time of day or night from anywhere in the world. Unlike in e-mail reference, the librarian can conduct a reference interview of sorts by seeking clarifications from the user. The librarian can conduct Internet searches and push web sites onto the patron's browser, and can receive immediate feedback from the patron as to whether the question has been answered satisfactorily. Several institutions in the US, including Cornell University, Internet Public Library, Michigan State University and North Carolina University, offer Internet chat-based services using software such as LivePerson, AOL Instant Messenger, Conference Room and Google Talk.
LiveRef (sm) (http://www.public.iastate.edu/~CYBERSTACKS/LiveRef.htm) maintains an online registry of real-time digital reference services.

7.8. My Settings, My Saved Searches and My Saved Articles

These features allow a user to define his or her preferences, to save search strategies and retrieve them later, and to view articles saved in previous sessions.

8. Summary

Establishing digital library resources and services requires a great deal of new infrastructure that is not available off the shelf as a packaged solution. There are no turn-key, monolithic systems for digital libraries; instead, digital libraries are collections of disparate systems and resources connected through a network and integrated within one interface, currently the web interface. The use of open architecture and standard protocols, however, makes it possible to gather the pieces of required infrastructure, be it hardware, software or accessories, from different vendors in the marketplace and integrate them into a working environment. Components required for a digital library can broadly be divided into the following six categories:

i)        Collection Infrastructure: The most important component of a digital library is the digital collection it holds or has access to. The viability and usefulness of a digital library depend upon the critical mass of its digital collection. The collection in a digital library consists of i) collections acquired in digital media; ii) access bought to external digital collections; iii) datasets that are "born digital"; iv) existing print media converted into digital format; and v) portal sites or gateways to electronic collections available on the web.

ii)      Digital Knowledge Organization: An effective and efficient access mechanism that allows a user to browse, search and navigate digital resources becomes necessary in a digital library as the electronic resources of a collection grow in number and complexity. As digital libraries are built around Web and Internet technology, they use the object and addressing protocols of the Internet. The process of organizing digital objects includes: a) developing a metadata schema; b) assigning metadata to each digital object; c) assigning a Unique Object Identifier to each digital object; d) linking digital objects with their associated metadata to facilitate browsing, searching and navigation; e) organizing digital objects and associated metadata into a database; and f) building browse, search and navigation interfaces.

iii)    Access Infrastructure: Access infrastructure for digital libraries consists of browse, search and navigational interfaces for individual digital libraries, specialized indices for specialized local collections, portals or subject gateways for web resources and an integrated interface for all e-resources accessible to a given library including traditional library resources.

iv)  Computer and Network Infrastructure: A typical digital library in a distributed client-server environment consists of hardware and software components at server side as well as at client's side.

v)    Intellectual Property Rights (IPR) and Digital Rights Management:  Copyright is manifested in terms of licenses and agreements in digital world. A library is required to sign licenses to acquire access to a digital collection. Publishers also deploy technological tools for authentication, access control and access tracking to protect their Intellectual Property Rights.

vi)    Digital Library Services: Just as library staff use printed resources to generate services in traditional libraries, digital resources are used to generate services through software-driven web-based interfaces. Web-based digital resources can potentially support a range of traditional and non-traditional library services. Most library services generated using digital resources closely resemble those generated manually, with improvements and modifications to suit the requirements of automated services. However, digital resources have also been used to generate innovative services that have no counterpart in the manual environment. Some of the important services offered by most digital repositories and digital collections include: a) E-mail alerts; b) RSS feeds or Atom feeds; c) Ask-an-Expert; d) Electronic document delivery services; e) Web-based user education; f) Digital reference service; g) Real-time reference service; and h) My settings, my saved searches and my saved articles.

References

Arora, Jagdish. Building digital libraries: An overview.  DESIDOC Bulletin of Information Technology, 21(6), 3-24, 2001.

Chepesuik, R. The future is here: America's libraries go digital. American Libraries, 2(1), 47-49, 1997.

Cox, John E. Publishers, publishing and the Internet: how journal publishing will survive and prosper in the electronic age. Electronic Library, 15(2), 125-131, 1997.

Digital Millennium Copyright Act (1998) of USA (http://www.copyright.gov/legislation/dmca.pdf) (Last visited on 30th January, 2013)

Flecker, D. Preserving scholarly e-journals. D-Lib Magazine, 7(9), 2001. (http://www.dlib.org/dlib/september01/flecker/09flecker.html)

Marchionini, G., Plaisant, C. and Komlodi, A. Interfaces and tools for the Library of Congress National Digital Library Program. Information Processing and Management, 34(5), 535-555, 1998.

National Geographic Map Machine  (http://maps.nationalgeographic.com/map-machine) (Last visited on 30th January, 2013)

Networked Digital Library of Theses and Dissertations (NDLTD) project (http://www.ndltd.org/). (Last visited on 30th January, 2013)

Russell, D and Gangemi, G.T. Computer security basics. Sebastopol, CA, O’Reilly, 1991.

Sloan,  Bernard G. Services perspectives for the digital library: Remote reference services. Library Trends, 47(2), 1998.


