Suggestions are warmly invited from everyone to help build this blog. Please send your suggestions and entries. Its scope is the entire world knowledge community, and it will contribute significantly to the career building of all competitive-exam candidates. You can send your suggestions to this email address: chandrashekhar.malav@yahoo.com
11. Science Indicators
P- 07. Informetrics & Scientometrics *
By: I K Ravichandra Rao, Paper Coordinator
1. MORE LEARNING IN AUDIO FORMS
http://epgp.inflibnet.ac.in/vt/ls/infsci/science_indicators/Science%20Indicators_SL.mp4
2. E-TEXTS
http://epgp.inflibnet.ac.in/vt/ls/infsci/science_indicators/Science%20Indicators_etext.mp4
Objectives
- Understand the meaning of indicator
- Understand the role of Science and Technology indicators
- Conceptual understanding of Scientometrics/ bibliometrics
- Illustrate some key indicators derived from bibliometrics and show their applications
Summary
The purpose of indicators is discussed, along with the factors to be considered in constructing them. The three basic laws of bibliometrics are discussed in the context of indicators.
Introduction
Science policy cannot be made in isolation. It must centre on S&T trends and their relation to current public policy developments. S&T policy must be based on evidence: authentic data collected and analyzed on a regular basis.
For policy input, it is important to derive the salient trends and indications from R&D statistics. In spite of some major inherent limitations, the approach of analyzing scientific activity through S&T indicators has come to stay. The debate has now shifted to how one can construct better indicators and collect better statistics: what the proxies for measuring the various parameters should be, and how to create composite indicators.
Indicators should help to describe the science and technology system, enabling better understanding of its structure, of the impact of policies and programs on it, and of the impact of science and technology on society and the economy.
It should be kept in mind that every indicator sends a signal. It conveys information about the particular element or sub-element that it represents, but not about the system as a whole. An ideal indicator should be representative: it should cover the most important aspects of the elements concerned. It should be reliable: it should directly reflect how far the objective concerned is met, and be well founded, accurate and measured in a standardized way. It should also be feasible: data should be readily available, and at reasonable cost.
Indicators are different from statistics:
- they measure the dimensions of a phenomenon;
- they are based on statistics that can be recurrent;
- they usually appear as a collection of statistics.
Price (1978) formulated the same requirement in the following way:
“To be meaningful, a statistic must be somehow anticipatable from its internal structure or its relation to data. It means the establishment of a set of simple and fundamental laws.”
Construction of Indicators: Key Considerations
Basic steps:
- Start from a concept and subdivide it along several sets of dimensions
- Create indicators for measuring each of the dimensions
- Create composite indicators
Indicator construction is based on direct measures and indirect measures.
There are two issues of primary importance: reliability (different people can use the indicators and get consistent results) and validity (the indicators are based on identifiable criteria and measure what they are intended to measure).
Common Indicators of R&D & Innovation: Strengths and Weaknesses
| Measure | Strengths | Weaknesses |
|---|---|---|
| R&D Activities | Regular and recognised data on the main source of technology | Lacks detail; underestimates small firms, design and production engineering |
| Research Papers | Good proxy to assess scientific research | Tacit and strategic knowledge not captured |
| Patents | Regular, detailed and long-term data | Uneven propensity to patent |
| Standards | Adoption indication | Uneven propensity; long, complex documents |
| Significant Innovations | Direct measure of output | Measure of significance; cost of collection; misses incremental changes |
| Innovation Surveys | Direct measure of output; comprehensive coverage | Variable definition of innovation; cost |
| Expert Judgments | Direct use of expertise | Finding independent expertise; judgements beyond expertise |
| Product Announcements | Close to commercialization | Misses in-house process innovations; misses incremental product improvements |
What is Scientometrics ?
The quantitative approach to characterizing scientific activity emerged as a new strand of research within science and technology studies in the 1960s. Within this approach, which was named ‘Scientometrics’, a research community became very active that was largely concerned with measuring the communication process of science. This research activity is called ‘bibliometrics’ and largely overlaps with scientometrics.
The term Scientometrics originated as a Russian term for the application of quantitative methods to the history of science. The scope and objectives have widened considerably.
Scientometrics is a wide-ranging field with vague boundaries. It is a generic term for a system of knowledge which endeavours to study the scientific (and technological) system, using a variety of approaches within the area of Science and Technology Studies (STS).
Scientometrics has followed the trajectory of econometrics in its use of quantitative data, concepts and models, and in its extensive use of mathematical and statistical techniques of modelling and data analysis.
This vision also implies the use of Scientometrics for decision-making in science policy, just like the use of econometrics for decision–making in economic policy.
Bibliometrics, especially evaluative bibliometrics, uses counts of publications, patents, citations and other potentially informative items to develop science and technology performance indicators.
Three implicit assumptions underlie the use and validity of bibliometric analysis.
- Activity measurement: counts of patents and papers provide valid indicators of R&D activity in the subject areas of those patents and papers, and at the institutions from which they originate.
- Impact measurement: the number of times those patents and papers are cited in subsequent patents or papers provides valid indicators of the impact or importance of the cited patents and papers.
- Linkage measurement: citations from papers to papers, from patents to patents, and from patents to papers provide indicators of intellectual linkages among the organisations that produce the patents and papers, and of knowledge linkages among their subject areas.
Bibliometric analysis can be applied at four levels:
(a) Evaluation of national or regional technical performance (policy level);
(b) Evaluation of the scientific performance of universities or the technological performance of companies (strategic level);
(c) Tracing and tracking R&D activity in specific scientific and technological areas or problems, science-technology linkage, etc. (tactical level); and
(d) Identifying specific activities and specific people engaged in R&D (conventional level).
Elements, units and levels of Aggregation in Bibliometrics
The elements of bibliometric analysis are publications and co-authors; units are specific aggregates such as journals, subject categories, institutions and countries to which papers can be assigned. References (citations) are specific elementary links between papers.
When dealing with patents, inventors and assignees are the relevant elements.
The distinction between three levels of aggregation is important. Each level requires its own methodological and technological approach.
- Micro level: research output of individuals and research groups
- Meso level: research output of institutions and scientific journals
- Macro level: research output of regions and countries
Scientometric Techniques
In terms of methodology, scientometric techniques can be classified into two categories: one-dimensional (or scalar) and two-dimensional (or relational) techniques.
One-dimensional techniques are based on direct counts (or occurrences) and graphical representation of specific bibliometric entities (e.g., publications and patents) or of particular data elements in these items, such as citations, keywords or addresses. They are used to generate scalar indicators for monitoring the state of the S&T system. Scalar indicators are increasingly being exploited for science policy purposes, both as descriptive and as diagnostic tools.
Two-dimensional techniques are based on co-occurrences of specific data elements, such as the number of times keywords, classification codes, citations and addresses are mentioned together.
Laws in Bibliometrics
| Law | Definition | Example |
|---|---|---|
| Lotka's law of scientific productivity | Describes the frequency of publication by authors in a given field. “Basically this means that in a given field a very large percentage of authors produce only one paper, fewer authors produce two papers, and so forth. Only a small number of authors produce a substantial number of publications”. | Egghe and Rao, in an article in the August 2002 issue of JASIST, worked on applying Lotka's law in cases where there are multiple authors of a single journal article. They referred to their analysis as fractional frequency distributions. |
| Bradford's law of scatter | Serves as a general guideline to librarians in determining the number of core journals in any given field. "The references are scattered throughout all periodicals with a frequency approximately related inversely to the scope.” | |
| Zipf's law of word occurrence | Predicts the frequency of words within a text. For a given text, the rank of a word multiplied by its frequency is a constant. Works well for high-frequency words, not so well for low-frequency words, hence a number of modifications. | |
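The rank-times-frequency relation stated for Zipf's law can be checked on any text with a few lines of Python. The following is a minimal sketch, not a reference implementation: the function name `rank_frequency_table`, the simple tokenizer and the sample sentence are all assumptions made for illustration, and a toy sentence is far too short to show the roughly constant product that a large corpus would.

```python
from collections import Counter
import re


def rank_frequency_table(text):
    """Return (rank, word, frequency, rank * frequency) rows, most frequent first."""
    # Assumed tokenizer: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Sort by descending frequency; break ties alphabetically so ranks are stable.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, word, freq, rank * freq)
            for rank, (word, freq) in enumerate(ordered, start=1)]


# Hypothetical sample text, used only to exercise the function.
sample = "the cat and the dog and the bird saw the cat"
for row in rank_frequency_table(sample):
    print(row)
```

On a real corpus one would inspect the last column for the high-frequency words, where the law is said to hold best.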
Science and Technology performance indicators
Scientific Papers
Listed indicators which show quantitative changes: measure the productivity

| Indicator | Further Description | Advantage | Disadvantage | Example |
|---|---|---|---|---|
| Number of papers | | Easy to retrieve | Does not say anything about the impact | Ex. 1 |
| Share of the number of papers | | | | Ex. 1 |
| Comparison of research output over the years | International comparison of countries by “the degree of contribution to the production of papers in the world” | Evolution of research output in different years | | Ex. 2 |
| Activity in different fields | | | | Ex. 3 |
| Co-authoring[1] | International collaboration / national collaboration / department collaboration | Shows to what extent an analyzed unit cooperates with other units in the production of papers | | Ex. 4 |
Some examples to illustrate the above-mentioned indicators:

| Research Areas | Papers (2000-11) | Share (%) (2000-11) |
|---|---|---|
| Engineering | 2424670 | 24.3 |
| Chemistry | 1621156 | 16.2 |
| Physics | 1604621 | 16.1 |
| Computer Science | 1274468 | 12.8 |
| Materials Science | 973841 | 9.7 |
| Biochemistry Molecular Biology | 916902 | 9.2 |
Example 4: Multi-authorship pattern of Indian publication activity in nanotechnology
| Year | Single Author (Share of Publications) | Two Authors (Share of Publications) | Multi Authors (Share of Publications) |
|---|---|---|---|
| 2000 | 13 (5.28) | 60 (24.39) | 173 (70.33) |
| 2005 | 51 (4.55) | 225 (20.05) | 846 (75.40) |
| 2009 | 103 (2.98) | 718 (20.78) | 2634 (76.24) |
Description: Method. Count the number of articles published by the analyzed unit during the analyzed time span and check how many of them were co-authored with a selected other unit. Divide the second figure by the first to get the share of articles co-authored between the units:
px = Px / P
where
px = share of publications co-authored with a certain unit
Px = number of publications co-authored with the selected unit
P = total number of publications produced at the analyzed unit during the analyzed time.
[1] For a detailed description, refer to Example 4.
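The division described in the method (a subset count over the unit's total output) is the same arithmetic that produces the percentage shares in Example 4. The sketch below recomputes those shares from the counts in Example 4; the helper name `share_percent` is an assumption for illustration, and note that Example 4 groups papers by number of authors rather than by a selected partner unit, so it shows the same kind of share and not the co-authorship share itself.

```python
def share_percent(part, total):
    """Share of `part` in `total`, as a percentage rounded to two decimals."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * part / total, 2)


# Counts copied from Example 4: (single-author, two-author, multi-author) papers.
example_4 = {
    2000: (13, 60, 173),
    2005: (51, 225, 846),
    2009: (103, 718, 2634),
}

shares = {}
for year, counts in example_4.items():
    total = sum(counts)  # P: all publications of the analyzed unit in that year
    shares[year] = tuple(share_percent(c, total) for c in counts)
    print(year, total, shares[year])
```

For a true co-authorship share, `part` would be the number of papers co-authored with the selected unit and `total` the analyzed unit's full publication count.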
Listed indicators which show qualitative changes: measure scientific impact

| Indicator | Further Description | Advantage | Disadvantage | Example |
|---|---|---|---|---|
| Number of citations | | Gives an indication of the scientific impact | Does not take into account that older articles are usually more cited and that citation rates vary between document types and subject areas | Ex. 5 |
| Citations per publication (CPP) | | Gives an indication of the average scientific impact | Citation rates vary between document types and subject areas | Ex. 5 |
| Citations received in the year of publication | How fast a paper made an impact on the international community | | | Ex. 5 |
| Uncited papers | The number of papers which did not receive a citation even once during the time period considered | | | Ex. 5 |
| Highly cited papers[1] | Number of papers that received maximum citations during the research period | Requires data from a comprehensive citation database | A high normalized citation score can be due to a few highly cited articles; this is not considered | Ex. 6 |
| Journal Impact Factor (IF)[2] | Used to measure the impact of the scientific journals where a paper is published | | | Ex. 7 |
| Number of papers in top ranked journals | Select journals according to a suitable criterion, such as the impact factor of the journal | Does reflect the potential impact of a paper | Does not take the size of the analyzed time duration into account | Ex. 8 |
Some examples to qualify the above mentioned indicators are:
Example 5: Publications from India: Nanotechnology Scenario
| Year | Publications | Citations | Citations per paper (in the year of publication) | Citations received in the year of publication [uncited papers in the year of publication; % uncited] | Uncited papers (% uncited)* |
|---|---|---|---|---|---|
| 2005 | 1072 | 15985 | 14.9 (0.3) | 295 [777; 72%] | 127 (12%) |
| 2009 | 3086 | 14559 | 4.7 (0.4) | 1364 [1869; 61%] | 762 (25%) |
| 2011 | 5020 | 5260 | 1.0 (0.4) | 2241 [3806; 76%] | 2674 (53%) |
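The derived columns of Example 5 (citations per paper and the uncited percentages) follow from simple ratios over the raw counts. The sketch below recomputes them from the figures in Example 5, assuming each ratio is taken over that year's publication count; the tuple layout, the function name `impact_indicators` and the dictionary keys are illustrative choices, not part of the source.

```python
# Rows copied from Example 5:
# (year, publications, citations, citations received in the publication year,
#  papers uncited in the publication year, papers uncited overall)
rows = [
    (2005, 1072, 15985, 295, 777, 127),
    (2009, 3086, 14559, 1364, 1869, 762),
    (2011, 5020, 5260, 2241, 3806, 2674),
]


def impact_indicators(row):
    """Compute citations-per-paper and uncited-share indicators for one year."""
    year, papers, cites, cites_first, uncited_first, uncited = row
    return {
        "year": year,
        "cpp": round(cites / papers, 1),                  # citations per paper
        "cpp_first_year": round(cites_first / papers, 1), # same, publication year only
        "pct_uncited_first_year": round(100 * uncited_first / papers),
        "pct_uncited": round(100 * uncited / papers),
    }


results = [impact_indicators(r) for r in rows]
for r in results:
    print(r)
```

The falling citations per paper for recent years reflects the point made in the indicator table: older articles have had more time to accumulate citations.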
Example 6: Trends in Highly Cited Papers (2011)
| Country | Total Papers (rank) | Top 1% highly cited papers (rank) |
|---|---|---|
| USA | 455541 (1) | 9308 (1) |
| Japan | 98890 (5) | 1098 (9) |
| Germany | 118598 (3) | 2626 (2) |
| UK | 102754 (4) | 2551 (3) |
| France | 82293 (6) | 1555 (5) |
| China | 235639 (2) | 1943 (4) |
| India | 55389 (10) | 319 (20) |
| S. Korea | 53601 (11) | 533 (15) |

Note: In this example, the top 1% highly cited papers of 2011 worldwide are taken, and the presence of each country is shown by its number of papers and its relative rank.
Example 7: Journal Impact Factor
The 2005 impact factor of the journal Nature is produced by counting the citations made in publications during 2005 to publications that appeared in Nature in 2003-2004, and dividing this count by the total number of citeable publications in Nature in 2003-2004.
Description:
I = C / P
where:
I = the impact factor for journal J in year Y
C = the number of citations from publications in year Y to publications in journal J published in years Y-2 and Y-1
P = total number of citeable publications in journal J in years Y-2 and Y-1
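As a worked illustration of the definitions of I, C and P, the sketch below divides a citation count by a count of citeable publications. The function name `impact_factor` and both input numbers are hypothetical, chosen only to make the arithmetic easy to follow; they are not real figures for Nature or any other journal.

```python
def impact_factor(citations, citeable_items):
    """I = C / P for a journal J in year Y.

    citations      -- C: citations in year Y to items published in Y-2 and Y-1
    citeable_items -- P: citeable items the journal published in Y-2 and Y-1
    """
    if citeable_items <= 0:
        raise ValueError("P must be positive")
    return citations / citeable_items


# Hypothetical counts, for illustration only.
c_2005 = 5000        # citations made in 2005 to the journal's 2003-2004 items
p_2003_2004 = 2000   # citeable items the journal published in 2003-2004
print(impact_factor(c_2005, p_2003_2004))
```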
Example 8: Publication activity in high IF journals
| High Impact Journals (IF) | USA (Rank) | Germany (Rank) | France (Rank) | China (Rank) | S. Korea (Rank) | India (Rank) |
|---|---|---|---|---|---|---|
| Cancer Journal for Clinicians (101.78) | 61.43 (1) | 11.98 (3) | 4.73 (5) | 1.01 (18) | 0.50 (20) | 0.30 (24) |
| Annual Reviews of Immunology (52.761) | 67.19 (1) | 3.56 (3) | 3.29 (5) | 0.00 (18) | 0.26 (20) | 0.00 |
| Reviews of Modern Physics (43.933) | 57.50 (1) | 16.13 (2) | 8.17 (3) | 0.62 (27) | 0.41 (31) | 1.34 (18) |
| Chemical Reviews (40.197) | 48.63 (1) | 9.90 (2) | 6.72 (3) | 2.74 (9) | 0.79 (20) | 2.33 (10) |
Identifying Conceptual Connections among documents
Identifying conceptual connections helps to locate papers that address key themes or concepts. More advanced techniques such as co-citation analysis help to identify the research front, and co-word analysis helps to show connections among concepts.
Common approach: common keywords among documents.
More sophisticated approaches: bibliographic coupling and co-citation analysis.
Similarity through Matching Reference (Bibliographic Coupling)
A reference in an article reflects one or more concepts upon which the article draws. Two articles that share a common reference (bibliographic coupling) would therefore have some linkage through the shared concept(s), even though the articles themselves might use vastly different terminology. Searching for linkages among two or more articles through shared references thus offers a way to identify linking mechanisms.
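Bibliographic coupling can be sketched as a set intersection over reference lists. In the example below, the article labels, the reference identifiers and the function name `coupling_strengths` are all hypothetical; counting shared references as the coupling strength is one simple choice, and the source does not prescribe a particular measure.

```python
from itertools import combinations

# Hypothetical toy data: each article maps to the set of references it cites.
references = {
    "A": {"r1", "r2", "r3"},
    "B": {"r2", "r3", "r4"},
    "C": {"r5"},
}


def coupling_strengths(refs_by_article):
    """Count shared references for every pair of articles that share at least one."""
    strengths = {}
    for a, b in combinations(sorted(refs_by_article), 2):
        shared = refs_by_article[a] & refs_by_article[b]
        if shared:
            strengths[(a, b)] = len(shared)
    return strengths


print(coupling_strengths(references))
```

Here articles A and B are coupled through two shared references, while C shares none and does not appear in the result.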
Similarity through identifying jointly cited papers (Co-Citation)
Co-citation analysis involves tracking pairs of papers that are cited together in the source articles. When the same pairs of papers are co-cited with other papers by many authors, clusters of research begin to form. The co-cited or “core” papers in these clusters tend to share some common theme, theoretical or methodological or both.
Method:
- references in a document are identified
- relatedness between these references is calculated (how many times two references occurred in the same document)
- the references are clustered using a transform of the co-occurrence matrix
- finally, the original documents are assigned to these reference clusters
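The first two steps of the method (identifying references and counting how often two references occur in the same document) can be sketched as follows. The document and paper identifiers and the function name `cocitation_counts` are hypothetical, and the clustering and document-assignment steps are deliberately left out because the source does not specify which transform or clustering algorithm is used.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each citing document maps to its reference list.
citing_documents = {
    "d1": ["p1", "p2", "p3"],
    "d2": ["p1", "p2"],
    "d3": ["p2", "p3"],
}


def cocitation_counts(refs_by_document):
    """Count, for each pair of references, how many documents cite both."""
    counts = Counter()
    for refs in refs_by_document.values():
        # De-duplicate and sort so each pair is counted once per document
        # and always appears in the same order.
        for pair in combinations(sorted(set(refs)), 2):
            counts[pair] += 1
    return counts


counts = cocitation_counts(citing_documents)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

The resulting pair counts form the co-occurrence matrix on which a clustering step would then operate.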
Co-word analysis
Co-word analysis is a content analysis technique that uses patterns of co-occurrence of pairs of items (i.e., words or noun phrases) in texts to identify the relationships between ideas within the subject areas presented in the texts, and the strength of those relationships.
Co-word analysis is very similar to co-citation analysis. The only difference is that it focuses on words in the document rather than references.
Method: The important words or phrases are identified and the relatedness between words is calculated (based on co-occurrence). Finally, the words are clustered and documents are assigned to these word clusters.
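The relatedness step of co-word analysis can be sketched by counting keyword pairs and normalizing the counts. The keyword sets and the function name `coword_strengths` below are hypothetical, and the normalization used, the equivalence index c_ij^2 / (c_i * c_j), is an assumption brought in for illustration; the source only says that relatedness is based on co-occurrence.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each document maps to its set of important keywords.
documents = {
    "doc1": {"nanotechnology", "citation", "india"},
    "doc2": {"nanotechnology", "citation"},
    "doc3": {"patent", "india"},
}


def coword_strengths(keywords_by_document):
    """Return {pair: (co-occurrence count, equivalence index)} for keyword pairs."""
    occurrences = Counter()  # c_i: number of documents containing keyword i
    pairs = Counter()        # c_ij: number of documents containing both i and j
    for keywords in keywords_by_document.values():
        occurrences.update(keywords)
        pairs.update(combinations(sorted(keywords), 2))
    return {
        pair: (count, count ** 2 / (occurrences[pair[0]] * occurrences[pair[1]]))
        for pair, count in pairs.items()
    }


strengths = coword_strengths(documents)
for pair, value in sorted(strengths.items()):
    print(pair, value)
```

An index of 1.0 means the two keywords always appear together in this toy set; values near 0 indicate weak association.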
What can be done with publication analysis: summary table

| Variables | Different indicators which can be constructed |
|---|---|
| Authors | Number in a subject, field, institution, country; growth; correlation with productivity; collaboration: co-authorship, associated networks; author in a subject |
| Origin | Rates of production, size, growth by country, institution, language, subject; correlation with economic and other indicators |
| Sources | Journals: growth, dynamics, numbers; life cycles; quantity/yield distribution; various distributions by subject, language, country |
| Contents | Analysis of texts: distribution of words and phrases in various parts; subject analysis; co-word analysis |
| Citations | Citation indexes, impact factors, co-citation studies, etc.; other analyses: number of references in articles, number of citations to articles, bibliographic coupling; co-citations: author connections, subject structure, networks, maps, etc.; validation of papers with qualitative methods and impact |

Note: Adapted from a study by Tefko Saracevic (Rutgers University)
Methodological problems of bibliometric-based indicators
Many of the problems in constructing bibliometric indicators can be addressed if one understands the principles behind the construction of indicators. This applies to S&T indicators in general, including bibliometric indicators.
Most S&T indicators often have little relationship with what they attempt to measure, with how those measurements might be carried out and used in policy design, and with how the policy instruments they create would influence the working of the economic system.
Some limitations apply primarily to bibliometric-based indicators. For publication-based indicators, the following limitations are the most visible:
- They indicate quantity of output, not quality
- Non-journal methods of communication are ignored
- Publication practices vary across fields, journals and employing institutions
- The choice of a suitable, inclusive database is problematic
- Undesirable publishing practices (artificially inflated numbers of co-authors; shorter papers)
- Papers represent only one output of laboratory-based activity
In particular, the fact that a paper is less frequently cited or is (still) uncited several years after its publication gives information about its reception by colleagues but does not reveal anything about the quality or standing of its author(s). A high degree of citation may indicate that its content has been integrated into the body of knowledge of the respective subject field. Low or no citations may indicate that the results involved do not contribute essentially to the contemporary scientific paradigm system of the subject field in question. Similarly, the major concerns in using citations have been identified as:
- An intellectual link between the citing source and the referenced article may not always exist
- Incorrect work can be highly cited
- Methodological papers are among the most highly cited
- Citations are lost in automated searches due to spelling differences and inconsistencies
- As with publication practices, citations vary across fields, journals and employing institutions
- The set of SCI and Scopus sources in which citations are available changes over time
- SCI and Scopus are biased towards English-language journals
- Works of great importance rapidly become part of common knowledge and are thus referred to in the literature without citation
- Citations may be critical rather than positive; however, it has been argued that even contested results make a contribution to knowledge
- The various scientific fields are cultivated by groups of varying size, and thus the probability of being cited varies from sector to sector
- The number of citations does not follow a linear rate over time
- The value of scientific work is not always acknowledged by contemporaries
It is important to understand the limitations of indicators based on publication and citation counts. There is a tendency to make claims that are questionable.
Bibliometrics is continuously improving:
- Normalization and comparability of indicators
- Self-citations, fractional counting
- Data standardization
- Coverage of the different outputs
- Monitoring of deficiencies and manipulation