Relevance of document types in the scores’ calculation of a specific field-normalized indicator: Are the scores strongly dependent on or nearly independent of the document type handling?
Scientometrics
https://doi.org/10.1007/s11192-022-04446-y
Robin Haunschild¹ · Lutz Bornmann¹,²
Received: 17 May 2021 / Accepted: 20 June 2022
© The Author(s) 2022
Abstract
Although field normalization is standard practice in bibliometrics, the detailed procedure
of field normalization is not standardized regarding the handling of the document types.
Either all publications can be used without filtering by document type, or only selected document types can be included. Furthermore, the field-normalization procedure can be carried out with or without regard
to the document type of publications. We studied whether the field-normalized scores
strongly depend on the choice of different document type handlings. In doing so, we used
the publications from the Web of Science between 2000 and 2017 and compared different field-normalized scores. We compared the results on the individual publication level,
the country level, and the institutional level. We found rather high correlations between
the different scores but the concordance values provide a more differentiated conclusion:
Rather different scores are produced on the individual publication level. As our results on
the aggregated levels are not supported by our results on the level of individual publications, any comparison of normalized scores that result from different procedures should
only be performed with caution.
Keywords Scientometrics · Bibliometrics · Document type · Field normalization
Introduction
According to one of the ten principles of the Leiden manifesto for the professional
application of bibliometrics in research evaluation, field-normalized scores should be used
instead of simple citation counts (Hicks et al., 2015). The citation impact of individual
* Robin Haunschild
Lutz Bornmann
¹ Max Planck Institute for Solid State Research, Heisenbergstr. 1, 70569 Stuttgart, Germany
² Science Policy and Strategy Department, Administrative Headquarters of the Max Planck Society, Hofgartenstr. 8, 80539 Munich, Germany
publications from the same year and scientific field is reflected by such scores. Whereas
this basic principle of field normalization has emerged as a standard procedure in bibliometrics, specific elements of the procedure are unclear or are applied differently (e.g., the use of the categorization system to define fields). One of these elements is
how the document type should be handled during the normalization procedure. The Leiden
Ranking (Waltman et al., 2012) and the SCImago Institutions Ranking (SIR)—two popular
institutional rankings—include different types of publications: The Leiden Ranking currently (CWTS, 2022) includes only the document types ‘Article’ and ‘Review’ whereas
the SIR additionally considers the document types ‘Conference Paper’ and ‘Short Survey’.
Both rankings consider the document type when calculating field-normalized scores.
InCites—a citation-based research analytics tool evaluating institutional productivity—
includes all document types and normalizes with respect to them separately (Clarivate Analytics, 2021). A similar procedure is applied in SciVal (Elsevier, 2019)—a tool that is very
similar to InCites. It seems to be a given for major rankings and tools that documents of
different types are treated separately in normalization, although there is to the best of our
knowledge no study yet that investigates this effect. Some databases, for example Microsoft
Academic Graph (Scheidsteger et al., 2018; Sinha et al., 2015; Wang et al., 2020), its successor OpenAlex (OurResearch, 2021; Priem et al., 2022), or Dimensions (Herzog et al.,
2020), do not distinguish between ‘Article’, ‘Review’, ‘Letter’, ‘Note’, ‘Editorial material’,
etc., which makes it impossible to consider document types. Another practical constraint might prevent
the consideration of document types during normalization procedures: If fields with rather
few publications are separated not only by publication year but also by document type, this
might lead to reference sets that are too small for normalization procedures (leading to unreliable
results). Furthermore, the assignment of document types is inconsistent between different
databases (i.e., an ‘Article’ in Web of Science (WoS) might be a ‘Review’ in Scopus).
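The reference-set concern can be made concrete with a small sketch that counts how many publications remain in each reference set once the document type is added to the grouping key. The records, the function name `reference_set_sizes`, and the threshold `MIN_SIZE` are hypothetical and chosen only for illustration; the text above does not state a minimum reference-set size.

```python
from collections import Counter

# Hypothetical records: (field, publication year, document type).
RECORDS = [
    ("small_field", 2015, "Article"),
    ("small_field", 2015, "Article"),
    ("small_field", 2015, "Article"),
    ("small_field", 2015, "Review"),
    ("small_field", 2015, "Letter"),
]

MIN_SIZE = 3  # assumed illustrative threshold, not taken from the source


def reference_set_sizes(records, by_doc_type):
    """Count publications per reference set.

    With by_doc_type=False the reference set is (field, year);
    with by_doc_type=True it is (field, year, document type).
    """
    return Counter(
        (field, year, dtype) if by_doc_type else (field, year)
        for field, year, dtype in records
    )


coarse = reference_set_sizes(RECORDS, by_doc_type=False)
fine = reference_set_sizes(RECORDS, by_doc_type=True)

# Reference sets that fall below the assumed threshold.
too_small_coarse = [key for key, n in coarse.items() if n < MIN_SIZE]
too_small_fine = [key for key, n in fine.items() if n < MIN_SIZE]

print(len(too_small_coarse), len(too_small_fine))
```

In this toy example the single field-and-year reference set holds five publications, whereas splitting by document type leaves two reference sets with only one publication each.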
Many studies in bibliometrics have shown that publications of different document
types not only gather a different average number of citations but also gather their citations at different speeds (see, e.g., Wang, 2013). Based on previous research on the Journal
Impact Factor (JIF, provided by Clarivate Analytics), we hypothesized that the handling
of the document type in calculating field-normalized scores will lead to different results
in research evaluation. In the calculation of the JIF, document types contribute differently
to the citations a journal receives (Clarivate Analytics, 2021; Van Leeuwen et al., 1998).
Glänzel & Moed (2002) list five factors that may influence the JIF. One of these factors is
the document type of a publication, i.e., the distribution of publications across document
types in a journal. In previous research on field-normalized indicators, Nederhof & Visser (2004) analyzed in a case study the change in average field-normalized citation scores
(significant increases of the indicator between 1989–1993 and 1994–1998) of two Dutch
universities. They identified a change in document type handling between the two time periods as one
reason for the significant increase of indicator values.
Table 1 Differences and commonalities of the three datasets used in this study

                                                      dt0          dt1          dt2
Number of papers                                      34,929,708   26,766,770   26,766,770
Number of document types                              35           4            4
Document type included in normalization procedure?    No           Yes          No

In this study, for verifying our hypothesis and possibly generalizing the results of the
case study by Nederhof & Visser (2004), we compare field-normalized citation scores,
which have been calculated based on three different ways of handling document types in
the normalization procedure. We are interested in whether they lead to the same, similar, or
different scores for the same papers—if everything else (i.e., the formula for calculating
the scores and the field classification) remains unchanged. This is an important question
for the use and interpretation of field-normalized scores in research evaluation: the scores
are calculated in different ways in the concrete research evaluation practice (see above). If
different document type handling leads to different scores, field-normalized scores from
different sources should only be compared with caution—although the field-normalized
indicator (and used field-categorization scheme) is the same.
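The three handlings summarized in Table 1 can be illustrated with a minimal sketch. It assumes a simple mean-normalized score (a paper's citations divided by the mean citations of its reference set) as a stand-in for the field-normalized indicator; this formula, the toy records, the function name `normalized_scores`, and the restriction to 'Article' and 'Review' as retained document types are hypothetical and serve only to show how the reference set changes between the handlings.

```python
from collections import defaultdict

# Hypothetical toy records: (paper id, field, year, document type, citations).
PAPERS = [
    ("p1", "chem", 2010, "Article", 10),
    ("p2", "chem", 2010, "Article", 30),
    ("p3", "chem", 2010, "Review", 60),
    ("p4", "chem", 2010, "Review", 100),
    ("p5", "chem", 2010, "Editorial Material", 0),
]


def normalized_scores(papers, keep_types=None, by_doc_type=False):
    """Divide each paper's citations by the mean of its reference set.

    keep_types:  document types to retain (None keeps all types).
    by_doc_type: if True, the reference set is field + year + document type;
                 otherwise it is field + year only.
    """
    if keep_types is not None:
        papers = [p for p in papers if p[3] in keep_types]

    def key_of(field, year, dtype):
        return (field, year, dtype) if by_doc_type else (field, year)

    # Collect the citation counts of every reference set.
    groups = defaultdict(list)
    for _pid, field, year, dtype, cites in papers:
        groups[key_of(field, year, dtype)].append(cites)

    scores = {}
    for pid, field, year, dtype, cites in papers:
        ref = groups[key_of(field, year, dtype)]
        mean = sum(ref) / len(ref)
        scores[pid] = cites / mean if mean > 0 else float("nan")
    return scores


selected = {"Article", "Review"}  # assumed selection for this toy example
dt0 = normalized_scores(PAPERS)                              # all types, type ignored
dt1 = normalized_scores(PAPERS, selected, by_doc_type=True)  # selected types, type used
dt2 = normalized_scores(PAPERS, selected)                    # selected types, type ignored

for pid in ("p1", "p3"):
    print(pid, round(dt0[pid], 2), round(dt1[pid], 2), round(dt2[pid], 2))
```

With these toy numbers, the same article (`p1`) scores 0.25, 0.5, and 0.2 under the three handlings, and the same review (`p3`) scores 1.5, 0.75, and 1.2, although the formula and the field assignment are unchanged.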
Methods
Data set
We used a custom database developed and maintained by the Competence Cente (...truncated)