Tuesday, December 10, 2013

16. Limitations of Bibliometrics and Scientometrics P- 07. Informetrics & Scientometrics

Suggestions from all of you are respectfully invited in creating this blog; please send your suggestions and entries. Its entire scope is the world knowledge community, and it will make an important contribution to the career building of all competitors. You can send your suggestions to this email address: chandrashekhar.malav@yahoo.com


By: I K Ravichandra Rao, Paper Coordinator




Objectives


After going through this Unit you will be able to:
  • Understand the importance of bibliometric analysis
  • Identify the various indicators used in bibliometric studies
  • Classify the indicators according to their purpose
  • Appreciate that a number of important qualifications (and limitations) must be borne in mind when assessing the validity of bibliometric analysis.


Summary



Bibliometrics is a set of mathematical and statistical tools used to measure the quantity and quality of books, articles, and other genres of publications.  Studies indicate that bibliometrics, as a metric, is an objective, quantitative and unobtrusive method used for research assessment.  However, it is not without limitations.  The limitations of citation analysis are discussed in this Unit.  Critics have also shown flaws in most of the indicators used in bibliometric studies.  Each of these indicators is explained along with its advantages and limitations.  Bibliometric analysis and the use of indicators serve their purpose when applied with care.  This Unit provides an overview of the currently used bibliometric indicators and summarizes the limitations and characteristics one should be aware of when evaluating the quantity and quality of scientific output.

Introduction

Bibliometrics, a term coined by Pritchard in 1969, is a measure used to understand the output and impact of scientific communication.  Publications and citations are the two important variables normally used in bibliometrics.  Bibliometrics has arguably provided ways and means of benchmarking and evaluating scholarly work.  In recent years there has been growing interest in the subject, yielding various rankings and numbers.
Bibliometrics, as a method and as a discipline, has gained considerable significance since its inception.  One important aspect of the increasing interest in bibliometrics, in libraries as well as in academia in general, is the growth in the use of bibliometrics to evaluate research performance, “especially in university and government labs, and also by policymakers, research directors and administrators, information specialists and librarians and researchers themselves” (Pendlebury, 2009).  The objectivity with which assessments can be made and the repeatability of the analyses are the basic reasons for its popularity.  Another reason for accepting bibliometrics as a measurement tool is that it is relatively inexpensive in terms of time, money and effort, provided a good data source is available.  Scalability is one of the main advantages of bibliometrics as a tool.  In other words, it can be applied from a micro level, i.e., an individual researcher or institute, to a macro level, i.e., a country or the world.  The scope for comparative analyses (temporal, geographic, linguistic, biographic, etc.) has drawn the attention of many scholars.  Bibliometrics has been widely accepted as a method for assessing research output.
Library administration is one of the early applications of bibliometrics.  The use of bibliometrics in collection development and management in libraries is a well-known practice, not least in relation to digital library management.  As a student of Library and Information Science (LIS), you would be interested to know that bibliometrics is a well-established part of LIS research.  There has been an increase in the number of research activities by the LIS profession in recent years (Naseer and Mahmood, 2009).
Not everything is rosy for bibliometrics.  It has also received some criticism.  While bibliometric data provide useful information, their application often seems to reflect a loss of critical and rational thinking.  We will also try to understand the limitations of bibliometrics in the subsequent sections of this Unit.


Limitations of Bibliometric Analysis

Bibliometrics undoubtedly has distinct strengths.  However, it is also subject to several valid criticisms.  Using bibliometrics to measure the quality of scholarly work has attracted the attention of critics.  As Laloë and Mosseri (2009) put it, the use of indices such as the h-index at the level of individuals is easy and therefore attractive, but mostly unscientific.  Publication counts, one of the widely used indicators, are criticised for measuring only quantity and not quality.  Publication counts have also been subject to other criticisms, such as problems associated with gratuitous co-authoring of articles; different publication practices across fields; and difficulties of defining fields of research, especially given strong trends towards collaborative research (Lundberg, 2006).  Similar criticisms apply to other measures as well.
Citation analysis, one of the most widely applied bibliometric techniques, is not free from criticism either.  Let us try to understand the limitations of citation analysis in more detail in the subsequent section.


Limitations of Citation Analysis

There exists a relationship between cited and citing documents, and a citation represents that relationship.  This is the fundamental principle on which citation analysis is based.  Citation analysis is the area of bibliometrics that deals with the study of this relationship.  It is arguably the most used and most useful bibliometric technique.  As a concept, it has been responsible for the development of most citation databases, such as Thomson Reuters's citation indexes, Elsevier's Scopus, Google Scholar and so on.

Why do researchers cite?  What is the citers’ motivation for referencing earlier work?  These questions have been comprehensively answered by Garfield (1979), the person who first introduced citation analysis.  He gives a list of 15 reasons for authors to cite earlier works.  On examining these reasons, one finds that some of them really do show scholarly impact, while others indicate a less-than-noble intention.  It is the second category of reasons that attracts the attention of critics and hampers the very purpose of citation analysis.

Citation analysis is based on a few assumptions: 1) Citation of a document implies use of that document by the citing author; 2) Citation of a document (author, journal, etc.) reflects the merit (quality, significance, impact) of that document (author, journal, etc.); 3) Citations are made to the best possible works; 4) A cited document is semantically related in content to the citing document (if two documents are bibliographically coupled, they are related in content; and if two documents are co-cited, they are related in content); and 5) All citations are equal.  All these assumptions have inherent flaws, which are succinctly described below:
  • Criticism about the first assumption:  The first assumption implies, first, that the citing author has partly or fully used, or been influenced by, the ideas in the cited work; and second, that all the cited documents were indeed used by the citing author.  Failure to meet either of these conditions amounts to sins of omission and commission.  These ‘sins’ undermine the fundamental principle on which citation analysis is based.
  • Criticism about the second assumption:  This assumption means that the citations received by a document indicate its quality.  At face value, one may say that there is a positive correlation between the number of citations received and the quality of the article.  But this is not always true.  Sometimes citations are received for the wrong reasons.  Many studies have shown that authors also cite spurious works.  This, too, shakes the very foundations of citation analysis.
  • Criticism about the third assumption:  Studies often show that not all good works are cited.  The number of citations depends, among other things, on the availability (accessibility) of the documents to the authors.  Accessibility of a document may be a function of its form, place of origin, age, and language.  Hence, non-availability of documents may have a negative impact on citation analysis.
  • Criticism about the fourth assumption:  This assumption has a bearing on the information retrieval function of the citation databases.  Martyn (1964) contends that bibliographic coupling is not a valid unit of measurement because one does not know whether two documents citing a third are citing the identical unit of information in it.  The same applies to co-citation as well; the fact that two papers are co-cited does not guarantee a relationship between their contents.
  • Criticism about the fifth assumption:  It is not difficult to see that the papers cited in an article are not all of equal importance in the context of the citing work.  However, this varied importance of the cited articles is neither reflected in the reference list nor used in citation analysis.  This might have a negative impact on the overall results.
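
The fourth assumption is easier to assess once it is clear what bibliographic coupling and co-citation actually count. The following is a minimal sketch in Python, assuming a small, hypothetical set of reference lists (the paper labels A, B, C, X, Y, Z and both function names are illustrative and not taken from any citation database). It shows that both measures are computed purely from shared entries in reference lists, which is why, as Martyn argues, they cannot by themselves establish that two documents share content.

```python
from collections import Counter
from itertools import combinations

# Hypothetical reference lists: citing paper -> set of papers it cites.
references = {
    "A": {"X", "Y", "Z"},
    "B": {"X", "Y"},
    "C": {"Z"},
}

def coupling_strength(refs, p, q):
    """Bibliographic coupling: number of references shared by citing papers p and q."""
    return len(refs[p] & refs[q])

def cocitation_counts(refs):
    """Co-citation: how often each pair of cited papers appears in the same reference list."""
    counts = Counter()
    for cited in refs.values():
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts

print(coupling_strength(references, "A", "B"))     # 2 (both cite X and Y)
print(cocitation_counts(references)[("X", "Y")])   # 2 (X and Y are cited together by A and B)
```

Neither count looks at which passage of a cited paper was used, so a high value signals shared references rather than a verified relationship between contents.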

Apart from the above, Smith (1981) lists other factors that may introduce errors into citation analysis: multiple authorship, self-citations, homographs, synonyms, types of sources, implicit citations, fluctuations with time, field variations, and errors.  The limitations of citation analysis do not negate its value as a research method when used with care.  There are, in fact, several application areas where citation analysis has been used successfully.







Characteristics and Limitations of Bibliometric Indicators


It is not uncommon to use bibliometrics as a tool for research assessment, as it is considered an objective, quantitative and unobtrusive method.  Bibliometric indicators have become important for both researchers and organizations.  For researchers, they are important because they provide an objective mode of assessing the diffusion and impact of their articles.  For organizations, they are significant because they are a means of assessing the quality of a particular work, person, or group.  However, as said at the end of the previous section, the usability of the indicators has to be properly assessed before applying them for quantification.  When applied judiciously, with care and caution, bibliometric indicators provide valid results useful for decision making.
Indicators are normally used as the unit of analysis in bibliometric studies.  For the sake of explanation here, they can be categorised into three groups, viz., Publication Indicators, Citation Indicators, and Journal Indicators.  The following sections briefly explain the indicators used in bibliometrics under each of these categories, along with their advantages and limitations, if any.  The list is compiled from the one given by Rehn (2007).

Publication Indicators

  • Number of Publications:  It is the number of publications produced by an author, institute, country and so on.  A time span is often specified to set the temporal scope.  The data is collected either directly from the original publications or from databases, and is relatively easy to collect.  Although this count is a very straightforward indicator that can be easily calculated by the authors themselves, one must be very careful when using it to compare authors or research groups.  The disadvantages of this indicator are that it does not take the size of the analysed unit into account and says nothing about the impact of the publications counted.
  • Number of ISI Publications:  It is the number of publications indexed by the Thomson ISI indices.  Temporal and geographic filters are applied in many studies.  The data is quite easy to collect as it can be taken directly from the databases.  The disadvantages are that it does not take the size of the analysed unit into account; it inherits the problems of scope and coverage of the ISI indices; and it does not count non-ISI publications.
  • Number of Publications in Top Journals:  It is the number of publications the analyzed unit has published in a selected number of journals during the analyzed time span.  The selection of journals is usually made on some criteria.  The advantage is that, as the data is collected from top journals (whose selection reflects their relative importance among others in the group), it is a better count than a mere publication count.  The disadvantages are that it does not take the size of the analyzed unit into account and that it has the limitations of the selection criteria.  Although this approach may look like a performance indicator, it was designed to address the shortcoming of the above-mentioned quantity indicator.
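
The publication indicators above are simple filtered counts. A minimal sketch in Python follows, assuming hypothetical records of the form (unit, journal, year); the group names, journal names and the `top_journals` selection are invented for illustration only.

```python
# Hypothetical publication records: (unit, journal, year).
records = [
    ("Group A", "Journal One", 2010),
    ("Group A", "Journal Two", 2011),
    ("Group A", "Journal One", 2013),
    ("Group B", "Journal Three", 2011),
]

# Assumed selection of "top" journals; real studies choose this on stated criteria.
top_journals = {"Journal One"}

def count_publications(records, unit, start, end, journals=None):
    """Count a unit's publications in the years start..end (inclusive),
    optionally restricted to a given set of journals."""
    return sum(
        1
        for u, journal, year in records
        if u == unit
        and start <= year <= end
        and (journals is None or journal in journals)
    )

print(count_publications(records, "Group A", 2010, 2013))                # 3
print(count_publications(records, "Group A", 2010, 2013, top_journals))  # 2
```

Both counts ignore how many researchers Group A has and how often the counted papers are cited, which are exactly the limitations noted for these indicators.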

Citation Indicators


  • Number of citations:  It is the total number of references received from other works, i.e., the number of citations to articles published by an analyzed unit during the analyzed time span.  The citation of one article by another is characteristic of scientific publications, and it is generally accepted that the number of citations to a particular article reflects its impact in the scientific community (Rehn, 2006).  The data has to be collected from citation databases such as Thomson Reuters Web of Knowledge, Scopus, Google Scholar, CiteSeerX and so on.  As they are collected from databases, the data and results are verifiable.  The limitations of this indicator are that it does not take into account that older articles are usually more cited and that citation rates vary between document types and subject areas; and that it does not compensate for the size of the unit.



  • Citations per publication:  It is the average number of citations to articles published by an analyzed unit during the analyzed time span.  It is calculated by counting the total citations and dividing that total by the number of publications considered.  The limitations are that it does not take into account that older articles are usually more cited if a variable, cumulative citation time window is used, and that citation rates vary between document types and subject areas.
  • Field Normalized Score:  This indicator corresponds to the relative number of citations to publications from a specific unit, compared to the world average of citations to publications of the same document type, age and subject area.  It is calculated as follows: the number of citations to each of the unit’s publications is normalized by dividing it by the world average of citations to publications of the same document type, publication year and subject area, which is called the field reference value (µf).  If an article is classified as belonging to several subject areas, a mean value of the areas is used.  The disadvantage is that, if the normalization is done at the article level, a few highly cited articles in a moderately cited research area may contribute disproportionately to the value of the field normalized citation score.
  • Total field normalized citation score:  This indicator gives an indication of both the impact and the production volume of the analyzed unit.  The score is obtained by adding together the item-oriented field normalized citation scores for all the publications of the analyzed unit.  The disadvantage is that it does not compensate for the size of the analyzed unit.
  • Journal normalized citation score:  This indicator corresponds to the number of citations to publications from a specific unit during an analyzed time span, compared to the world average of citations to publications of the same document types, ages and in the same journals.  The calculation is as follows: the number of citations to each of the unit’s publications is normalized by dividing it by the world average of citations to publications of the same document type, published in the same year in the same journal.  The indicator is the mean value of all the normalized citation counts for the unit’s publications.
  • Crown indicator:  It was developed by the Centre for Science and Technology Studies (CWTS) at Leiden University.  It is intended to measure the scientific impact of a researcher or a research group.  This indicator is calculated by dividing the average number of citations received by a researcher or a research group by the average number that could be expected for publications of the same type, during the same year, and published in journals within the same field (Lundberg, 2007).  It also has a few flaws.  First, its dependence on the categories published by Thomson Reuters means that it does not take into account that publications from a particular field are often published in journals categorized in another field.  Second, the size of a research group influences its productivity: quite simply, the more researchers in a group, the larger the number of published articles.  It is therefore recommended to compare research groups with the crown indicator only if the groups are of similar sizes (Lundberg, 2007).

  • h-index:  The h-index is an index that attempts to measure both the scientific productivity and the apparent scientific impact of a scientist.  J.E. Hirsch introduced it in 2005 and defined it in the following way: “A scientist has index h if h of [his/her] Np papers have at least h citations each, and the other (Np - h) papers have at most h citations each”.  The “Web of Science” now gives direct access to the h-index in a few mouse clicks.  It is calculated as follows: find the unit’s published articles in a citation index and sort them in descending order by number of citations.  Count articles from the top of the list downwards; when the rank number of an article rises above the citation count for that very article, the rank number of the preceding article is the h-index.  The h-index is criticized because it gives a positive bias to senior researchers with older articles, since these have had more time to be cited, although the requirement that new articles with comparable citation levels must be added for the index to rise dampens that bias somewhat.
  • Uncitedness:  It is the share of a unit’s publications that remain uncited after a certain time period.  Self-citations should be removed from the citation count.  It requires data from a comprehensive citation database such as the Thomson citation indices and validation of the unit’s publications.
  • Self-citation:  It is the share of a unit’s received citations where authors refer to their own papers.  It is calculated as follows: count the total number of citations to the unit’s publications during the analyzed time span.  Check where the citations are coming from and count the number coming from the unit itself.  Divide the second number by the first to get the share of self-citations.  This requires data from a comprehensive citation database such as the Thomson citation indices, validation of publications and analysis of citing articles, which can be done in the ISI Web of Science.
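
The calculations described for the citation indicators can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the citation counts, the field reference values and all function names are hypothetical, and taking the mean of the item-level normalized scores as the unit's score is an assumption carried over from the journal normalized score described in the list.

```python
def citations_per_publication(citations):
    """Total citations divided by the number of publications considered."""
    return sum(citations) / len(citations)

def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank      # this article still has at least `rank` citations
        else:
            break         # rank has risen above the citation count: stop
    return h

def uncited_share(citations):
    """Share of publications with zero citations (self-citations assumed already removed)."""
    return sum(1 for c in citations if c == 0) / len(citations)

def self_citation_share(total_citations, citations_from_unit):
    """Citations coming from the unit itself, divided by all received citations."""
    return citations_from_unit / total_citations

def field_normalized_scores(citations, field_reference_values):
    """Divide each publication's citations by its field reference value."""
    if len(citations) != len(field_reference_values):
        raise ValueError("one field reference value is needed per publication")
    return [c / mu for c, mu in zip(citations, field_reference_values)]

# Hypothetical unit with six publications.
citations = [10, 8, 5, 4, 3, 0]
# Hypothetical world averages for the same document type, year and subject area.
reference_values = [5.0, 4.0, 5.0, 2.0, 3.0, 1.0]

scores = field_normalized_scores(citations, reference_values)

print(citations_per_publication(citations))   # 5.0
print(h_index(citations))                     # 4
print(uncited_share(citations))               # one of six publications is uncited
print(self_citation_share(30, 6))             # 0.2
print(sum(scores))                            # 8.0 (total field normalized citation score)
print(sum(scores) / len(scores))              # mean item-level normalized score
```

In this example the total field normalized score (8.0) grows with the number of publications, while the mean does not, which illustrates why the total does not compensate for the size of the analyzed unit.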

Journal Indicators


  • Impact factor:  The impact factor (IF) of an academic journal is a measure reflecting the average number of citations to recent articles published in the journal, and it is intended to gauge the importance of a journal in its field.  Journals with a higher impact factor are perceived to be more important in their field than those with a lower one.  The impact factor was devised by Eugene Garfield, the founder of the Institute for Scientific Information.  Impact factors are calculated yearly for those journals that are indexed in the Journal Citation Reports.  IFs are available in the SCI (Science Citation Index) Journal Citation Reports and on the Web of Knowledge for more than 8000 selected scientific journals.  The IF does have several limitations (Durieux & Gevenois, 2010).  First, although a higher IF can suggest a greater impact of a journal, it does not reflect the quality of each particular article published by that journal.  Consequently, it is not clear whether a high IF is due to a moderate degree of citation of all of the articles published or to a high degree of citation of only some articles.  Second, multidisciplinary journals usually have a higher IF than specialized journals.  Third, there are differences between research fields, including in research intensity.  The highest ranking journal in each specialized field may have a very different IF from specialty to specialty.  Fourth, the types of articles published by a journal also influence its IF.  Review articles and technical reports are cited more frequently than original research articles, case reports, and pictorial essays.
  • Normalized Journal Impact Factor:  This indicator corresponds to the relative number of citations to publications in one specific journal, compared to the world average of citations to publications of the same document type, age and subject area.  The indicator is stated as a decimal number that shows the relation of the number of citations to the world average.  As an example, 0.9 means that publications in this journal are cited 10% below average, and 1.2 means that they are cited 20% above average.
  • Immediacy Index:  It is an indicator that measures the current importance of the work published by a journal by calculating the average number of times articles published during a particular year by a specific journal are cited over the course of that same year.  The immediacy index is useful for identifying journals that publish articles in emerging areas.  It is said that the immediacy index has an unintended bias towards articles published in the earlier part of the year, as they have more time to be cited than articles published later in the year.
  • Journal-to-field impact score:  The journal-to-field impact score has been proposed by the Centre for Science and Technology Studies (CWTS) of Leiden University (Leiden, the Netherlands) as an alternative to the IF.  It measures the average number of cited articles in a specific journal and compares this number with that of other journals in the same research field category.  The field categorization of journals is based on the journal subject categories, which are defined by Thomson Reuters.  By ranking journals in a given subject category, this score overcomes the limitations of the IF related to research field characteristics such as productivity, citation habits, and citation dynamics.
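
The journal indicators above are all ratios, and a short Python sketch makes their arithmetic concrete. It assumes the commonly used two-year window for the impact factor (citations in a given year to items published in the two preceding years, divided by the number of citable items published in those two years); all figures and function names are hypothetical.

```python
from statistics import mean, median

def impact_factor(citations_to_prev_two_years, items_in_prev_two_years):
    """Two-year impact factor: citations received this year by items from the
    two preceding years, divided by the number of those items."""
    return citations_to_prev_two_years / items_in_prev_two_years

def immediacy_index(citations_same_year, items_same_year):
    """Citations received in a year by articles published in that same year,
    divided by the number of those articles."""
    return citations_same_year / items_same_year

def relative_to_world_average(journal_mean, world_mean):
    """Normalized score: 0.9 means 10% below the world average, 1.2 means 20% above."""
    return journal_mean / world_mean

print(impact_factor(300, 120))              # 2.5
print(immediacy_index(30, 60))              # 0.5
print(relative_to_world_average(4.5, 5.0))  # 0.9

# Two hypothetical journals with the same average citation rate per article.
evenly_cited = [3, 3, 3, 3]
skewed = [12, 0, 0, 0]
print(mean(evenly_cited), mean(skewed))      # 3 3
print(median(evenly_cited), median(skewed))  # 3 0.0
```

The last two lines illustrate the first limitation listed for the IF: an average cannot distinguish a journal whose articles are all moderately cited from one that owes its score to a single highly cited article.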



Conclusion



Bibliometric analysis is based on many indicators.  Bibliometric indicators seek to measure the quantity and impact of scientific publications.  They are used increasingly in evaluation processes in universities and in public and private research institutes.  Each of these indicators has its own merits and limitations.  Interpreted as exhaustive measures of scientific output, bibliometric indicators would present a biased story.  However, when used with caution, they can reveal insights, through trends, into aspects of scientific production at the global level.
In conclusion, let us bear in mind the following words of caution given by Durieux & Gevenois (2010) when using bibliometric indicators for analysis:
  • Performance indicators are based on the assumption that the quality of a particular article is reflected by the frequency of its citations in other articles.
  • Given differences between fields of research in terms of productivity, citation habits, and citation dynamics, bibliometric indicators should not be used for comparing researchers, research groups, or journals from different fields.
  • It is recommended to measure the quality and the impact of scientific journals, research groups, or particular researchers through several indicators rather than only one.



References

Durieux, V., & Gevenois, P. A. (2010). Bibliometric Indicators: Quality Measurements of Scientific Publication. Radiology, 255(2), 342-351.

Garfield, E. (1979). Citation indexing: Its theory and application in science, technology, and humanities (Vol. 8). New York: Wiley.

Laloë, F., & Mosseri, R. (2009). Bibliometric evaluation of individual researchers: not even right... not even wrong!. Europhysics News, 40(5), 26-29.

Lundberg, J. (2006). Bibliometrics as a research assessment tool: impact beyond the impact factor. Doctoral Thesis. Sweden: Karolinska Institutet.

Lundberg, J. (2007). Lifting the crown—citation z-score. Journal of Informetrics, 1(2), 145-154.

Martyn, J. (1964). Bibliographic coupling. Journal of Documentation, 20(4), 236.

Naseer, M. M., & Mahmood, K. (2009). Use of bibliometrics in LIS research. LIBRES: Library and Information Science Research Electronic Journal, 19(2).

Pendlebury, D.A. (2009). Whitepaper: Using bibliometrics: A guide to evaluating research performance with citation data. Philadelphia: Thomson Reuters.

Rehn, C., & Kronman, U. (2008). Bibliometric handbook for Karolinska Institutet. Huddinge: Karolinska Institutet.

Rehn, U. K. C. (2007). Bibliometric indicators–definitions and usage at Karolinska Institutet.

Smith, L. C. (1981). Citation analysis. Library Trends, 30(1), 83-106.



