Identifying “hot papers” and papers with “delayed recognition” in large-scale datasets by using dynamically normalized citation impact scores
Lutz Bornmann (Division for Science and Innovation Studies, Administrative Headquarters of the Max Planck Society, Hofgartenstr. 8, 80539 Munich, Germany)
Adam Y. Ye (Center for Bioinformatics, School of Life Sciences, Peking University, Beijing 100871, China)
Fred Y. Ye (Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing University, Nanjing 210023, China)
''Hot papers'' (HPs) are papers which receive a boost of citations shortly after publication. Papers with ''delayed recognition'' (DRs) receive scarcely any impact over a long time period before a considerable citation boost starts. DRs have attracted a lot of attention in scientometrics and beyond. Based on a comprehensive dataset with more than 5,000,000 papers published between 1980 and 1990, we identified HPs and DRs. In contrast to many other studies on DRs, which are based on raw citation counts, we calculated dynamically field-normalized impact scores for the search for HPs and DRs. This study investigates the differences between HPs (n = 323) and DRs (n = 315). The investigation of the journals which have published HPs and DRs revealed that some journals (e.g. Physical Review Letters and PNAS) were able to publish significantly more HPs than other journals. This pattern did not appear for DRs. Many HPs and DRs have been published by authors from the USA; however, in contrast to other countries, authors from the USA have published statistically significantly more HPs than DRs. Whereas ''Biochemistry & Molecular Biology,'' ''Immunology,'' and ''Cell Biology'' have published significantly more HPs than DRs, the opposite was found for ''Surgery'' and ''Orthopedics.'' The results of the analysis of certain properties of HPs and DRs (e.g. number of pages) suggest that the emergence of DRs is an unpredictable process.
Keywords: Hot paper; Paper with delayed recognition; Dynamically normalized citation impact scores
In most evaluations of researchers, research groups, and academic institutions, bibliometric
indicators—especially citation impact scores—are used in an informed peer review process
(Bornmann et al. 2014)
. A frequent problem of the application of citation impact scores in
these processes is that the evaluations focus—as a rule—on the recent performance of the
evaluated units (e.g. the last 3 years). However, the ‘‘true’’ impact of a publication can be
determined only after a longer time period in several disciplines: ‘‘A 3-year time window is
sufficient for the biomedical research fields and multidisciplinary sciences, while a 7-year
time window is required for the humanities and mathematics’’
(Wang 2013, p. 866). Thus, the strength of bibliometrics lies in identifying outstanding publications (or the corresponding outstanding researchers, research groups, and institutions, respectively) in the long run.
In recent years, several bibliometric studies have dealt with the investigation of a
subgroup of publications showing a specific long term citation impact: papers with delayed
recognition (DRs). Publications are denoted as DRs if they received only a few or no
citations over many years (e.g., 10 years after their appearance) and then experienced a
significant boost in citations. For example, Van Calster (2012) shows that Charles Sanders
Peirce's (1884) note in Science on ''The Numerical Measure of the Success of Predictors'' is
a typical case of a DR. The note received ‘‘less than 1 citation per year in the decades prior
to 2000, 3.5 citations per year in the 2000s, and 10.4 in the 2010s’’ (p. 2342). Marx (2014)
demonstrates that the initial reception of the paper ‘‘Detailed Balance Limit of Efficiency
of P–N Junction Solar Cells’’ by Shockley and Queisser (1961) was hesitant; after several
years, the paper has become a highly cited paper in its field.
Gorry and Ragouet (2016)
present a landmark paper in interventional radiology, which can be characterized as a DR.
In ‘‘Literature review’’ section, we explain the different methods which have been
introduced in scientometrics to identify these and other DRs in bibliometric databases.
Based on these methods,
Ye and Bornmann (2018)
propose the citation angle, which can
be used to distinguish between ‘‘hot papers’’ (HPs) and DRs. In contrast to DRs, HPs
received a boost of citations shortly after publication (and not after several years as DRs).
In this study, we searched for HPs and DRs among all papers published between 1980 and
1990. Since citation counts should be normalized with regard to publication year and
subject category (of the cited publication), we generated dynamically normalized citation
impact scores (DNIC), which are annually field-normalized impact scores based on OECD
minor codes1 for field delineation. We used these scores for the search of HPs and DRs.
The objective of this study is to analyze systematic differences between papers which
became HPs or DRs later on. Factors which have been identified in recent years as
correlates of citations
(Bornmann and Leydesdorff 2017; Tahamtan et al. 2016) are used to determine different characteristics of both paper groups.
As factors, this study focuses on the publication year, the number of authors, countries,
references and pages of a publication as well as its inter-disciplinarity (measured by the
number of subject categories).
1 see http://www.oecd.org/science/inno/38235147.pdf.
where ct is the number of citations received in the tth year after publication and t is the age of a
paper. A paper reached the maximum number cm of annual citations at time tm. The
equation of the straight line (l) which connects two points (0, c0) and (tm, cm) in the annual
citation curve is defined as
$$ l:\; c = \frac{c_m - c_0}{t_m}\, t + c_0. $$
The coefficient B is considered an elegant and effective method for retrieving DRs in big datasets.
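As a concrete illustration, the coefficient B can be computed from a paper's annual citation series. The following is a minimal Python sketch of the definition by Ke et al. (2015), summing the gaps between the straight line l connecting (0, c0) and (tm, cm) and the actual annual counts:

```python
def beauty_coefficient(c):
    """B coefficient (Ke et al. 2015) for a list of annual citation counts c[0], c[1], ...

    Sums, over the years up to the citation peak, the gap between the line l
    connecting (0, c0) and (tm, cm) and the actual counts, each gap scaled
    by max(1, c[t])."""
    tm = max(range(len(c)), key=lambda t: c[t])  # year of the citation peak
    if tm == 0:                                  # peak in the publication year: B = 0
        return 0.0
    c0, cm = c[0], c[tm]
    return sum(((cm - c0) / tm * t + c0 - c[t]) / max(1, c[t]) for t in range(tm + 1))

# A paper ignored for a decade and then cited heavily gets a large B,
# while an immediately cited paper gets B = 0.
sleeping = beauty_coefficient([0] * 10 + [20])   # large B
hot = beauty_coefficient([20, 10, 5, 2, 1])      # B = 0.0
```

Large positive B thus flags delayed-recognition profiles, while early-peaking profiles yield B close to (or exactly) zero.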
Ye and Bornmann (2018)
reveal its dynamic characteristics and extend B by an HP component. Furthermore, they introduced the citation angle for
unifying the approaches of identifying instant and delayed recognition. The distinction
between DRs and HPs follows
Baumgartner and Leydesdorff (2014)
who introduced two
groups of papers: (1) ‘‘Citation classics’’ or ‘‘sticky knowledge claims’’ have a lasting
impact on a specific field. DRs are a sub-group among citation classics, whose lasting
impact is not combined with early citation impact. (2) The other paper group (‘‘transient
knowledge claims’’) has an early boost of citation impact followed by a fast impact
decrease shortly after publication. According to
Baumgartner and Leydesdorff (2014)
papers in this group are contributions at the research front.
Comins and Leydesdorff (2016)
investigated the existence of both paper types empirically.
van Raan (2015) demonstrated that many DRs are application-oriented and thus are
potential ‘‘sleeping innovations’’. In a follow-up study, van Raan (2016) analyzed
characteristics of DRs which are cited in patents. The results show that patent citations occur
before or after the delayed recognition started. The citation rate during the period of sleep
is not related to the later scientific or technological impact of the DRs. The comparison of
DRs with ‘‘normal’’ papers reveals that DRs are more frequently cited in patents than
Definitions of ‘‘hot papers’’ (HP) and papers with ‘‘delayed recognition’’ (DRs)
Following the definitions of HPs and DRs hitherto, the typical DR is defined as a
publication with a late citation peak, and prior annual citations which are much lower than the
peak citations, while a typical HP is defined as a publication with an early citation peak and
later annual citations which are much lower than the early peak. In contrast to the other
studies, which used raw citation counts to identify DRs (see ‘‘Literature review’’ section),
this study is based on (dynamically) field- and time-normalized citation impact scores—the
standard impact measure in bibliometrics. The dynamically normalized
impact of citations (DNIC) is defined as
$$ \mathrm{DNIC}_{ij} = \frac{C_{ij}}{E_{kj}}, \qquad k = f(i), $$
$$ E_{kj} = \frac{1}{N_{kj}} \sum_{i:\, f(i)=k} C_{ij}, $$
where i = 1,2,… are publications, j = 1,2,… are citing years, and k = 1,2,… are different
fields (here defined by OECD minor codes). Cij denotes received citations by publication
i in year j, and Ekj denotes mean (received) citations of all publications in field k and year
j (i.e. Ekj is the expected value). Nkj is the number of cited publications in field k and year
j (note: Nkj is a variable which is based on non-zero citations), and k = f(i) means a certain
field of a given publication. The indicator follows the standard approach in bibliometrics with both field- and time-normalized citations. The only difference from the standard approach is that the calculation is based on annual citations (dynamically), and not on the citations between the publication year and a fixed later time point. If Cij = 0, then DNICij = 0.
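The DNIC definition above can be sketched in Python. The data layout (citation counts keyed by paper and citing year, plus a paper-to-field mapping) is a hypothetical illustration, not the study's actual database schema:

```python
from collections import defaultdict

def dnic_scores(citations, field_of):
    """Dynamically normalized citation impact: DNIC_ij = C_ij / E_kj with k = f(i).

    citations: dict mapping (paper_id, citing_year) -> citation count C_ij
    field_of:  dict mapping paper_id -> field k (e.g. an OECD minor code)
    """
    # Expected value E_kj: mean citations of the cited (non-zero) publications
    # in field k and citing year j; N_kj counts those publications.
    totals, counts = defaultdict(float), defaultdict(int)
    for (i, j), c in citations.items():
        if c > 0:
            totals[(field_of[i], j)] += c
            counts[(field_of[i], j)] += 1
    expected = {kj: totals[kj] / counts[kj] for kj in totals}
    # DNIC_ij = C_ij / E_kj; by definition DNIC_ij = 0 if C_ij = 0.
    return {(i, j): (c / expected[(field_of[i], j)] if c > 0 else 0.0)
            for (i, j), c in citations.items()}
```

For example, two papers in the same field cited 4 and 2 times in a given year have E_kj = 3 and receive DNIC scores of 4/3 and 2/3, respectively.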
All points of DNICij = 1 in field k yield the field- and time-normalized line LN (see the
distribution in theory of DNIC in Fig. 1). If DNICij > 1, the citation impact of the
publications is higher than the average in the corresponding fields and publication years, as
shown with line LA. If DNICij < 1, the impact is lower than the average, as shown with the
line LU. In practical terms, however, citation counts Cij and expected values Ekj are variable
terms. The DNIC distribution of many papers changes from year to year (see the
distribution in practice in Fig. 1). Therefore, by using DNIC for impact normalization of papers
in this study we need rules for identifying HPs and DRs. We oriented these rules towards
the rules of thumb defined by van Raan (2004a, 2008) for interpreting field-normalized
citation scores. DNICij is a dynamic series of annually normalized impact scores. We
suggest identifying HPs and DRs with the criteria given in Table 1.
In Table 1, DNICpeak_t < th denotes that the peak is located in the early half of the
citation impact distribution (covering ± 2 years); DNICpeak_t > th denotes that the peak
is located in the late half (covering ± 2 years). DNICa_peak_t refers to all DNICij
after the peak (+ 2 years), and DNICb_peak_t refers to all DNICij before the peak
(− 2 years). In this study, th = 13. We have data covering 36 citing years (1980–2015) and
needed to compare the years 1980–1990 dynamically. Thus, we selected 16 years as the
time span of citations for each publication, such as 1980–1995 for the papers from 1980
and 1981–1996 for the papers from 1981.
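The screening rule described above can be sketched as follows. Note that the exact "much lower than the peak" criterion comes from the paper's Table 1, which is not reproduced here; the half-peak threshold below is a placeholder assumption, as are the function and parameter names:

```python
def classify(dnic, th=13, margin=2, low_frac=0.5):
    """Label a 16-year DNIC series as HP, DR, or neither.

    th:       half-way point of the citation window (here th = 13)
    margin:   the +/- 2 years around the peak excluded from the comparison
    low_frac: placeholder for Table 1's "much lower than the peak" criterion
    """
    peak_t = max(range(len(dnic)), key=lambda t: dnic[t])
    peak = dnic[peak_t]
    before = dnic[:max(0, peak_t - margin)]   # DNIC_b_peak_t
    after = dnic[peak_t + margin + 1:]        # DNIC_a_peak_t
    if peak_t < th and all(v < low_frac * peak for v in after):
        return "HP"    # early peak, much lower impact afterwards
    if peak_t > th and all(v < low_frac * peak for v in before):
        return "DR"    # late peak, much lower impact before
    return "other"
```

An early-peaking series is then labeled HP and a late-peaking series with a quiet start is labeled DR; everything else falls through to "other".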
Thomson Reuters). From the in-house database, we selected only papers with the document
type ‘‘article’’ to have comparable citable units. The DNIC scores for each paper refer to
the period from its publication year until the end of 2015.
Using the methods explained in ‘‘Definitions of ‘hot papers’ (HP) and papers with
‘delayed recognition’ (DRs)’’ section, we found the numbers of HPs and DRs in the dataset
as reported in Table 2. Since HPs and DRs have been identified by using normalized
impact scores within single fields and many papers belong to more than one field, there are
duplicates among HPs and DRs. Thus, 191 duplicates were deleted from the 2636 DRs and
HPs (147 papers appeared twice and 44 papers three times in the dataset). Figure 2
demonstrates clear differences in citation profiles of HPs and DRs following the definitions of
both groups in ‘‘Definitions of ‘hot papers’ (HP) and papers with ‘delayed recognition’
Both, HPs and DRs are groups of papers with extreme citation profiles (see Fig. 2). In
order to reveal how these extreme groups differ from ‘‘normal’’ papers in certain
properties, we drew a random sample from the in-house database with n = 323 papers (date:
December 8, 2016). The random sample has been selected from the ten WoS subject
categories in which most of the DRs and HPs were published. The population of the random sample
(N = 1,198,843) contains papers from 1980 to 1990 and is restricted to the document type
‘‘article’’. The size of the random sample with n = 323 papers has been determined by a
power analysis. Its results showed that we need 323 papers in each group to detect a very
small effect (f = .1) as statistically significant at the α = .05 level with a power of .8.
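The sample-size calculation can be reproduced with a noncentral-F power computation, as implemented in tools such as G*Power. The following sketch assumes SciPy is available and uses the common convention that the noncentrality parameter is λ = f²·N:

```python
from math import ceil
from scipy.stats import f as f_dist, ncf

def anova_total_n(effect_f=0.1, alpha=0.05, target_power=0.8, k=3):
    """Smallest total N for a one-way ANOVA with k groups reaching the target power."""
    n = k + 2
    while True:
        df1, df2 = k - 1, n - k
        crit = f_dist.ppf(1 - alpha, df1, df2)      # critical F value under H0
        lam = effect_f ** 2 * n                     # noncentrality parameter
        power = 1.0 - ncf.cdf(crit, df1, df2, lam)  # P(reject | effect size f)
        if power >= target_power:
            return n
        n += 1

total = anova_total_n()        # total N across the three groups
per_group = ceil(total / 3)    # close to the 323 papers per group used here
```

For f = .1, α = .05, power = .8, and three groups, this yields a per-group size of roughly 323, matching the study's sample.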
Considering the third group of randomly selected papers (RANs), the dataset (n = 2768)
of this study consists of 2130 HPs (77%), 315 DRs (11%), and 323 RANs (12%). In order
to have three groups of papers with a more or less balanced set of case numbers, we drew a
random sample of 323 papers from the 2130 HPs—following the results of the power
analysis. Thus, the final dataset (n = 961) consists of 323 HPs (33.6%), 315 DRs (32.8%),
and 323 RANs (33.6%).
This study tests whether the mean values (e.g., the mean number of authors or pages) from
k groups (HP, DR, and RAN) are the same or not. With the analysis of variance (ANOVA)
any overall difference between the k groups can be tested for statistical significance. The
ANOVA separates the variance components into two parts: those due to mean differences
and those due to random influences (Riffenburgh, 2012). There are three general
assumptions for calculating the ANOVA: (1) The data are independent of each other. (2)
The distribution of the data is normal. (3) The standard deviation of the data is the same for
all groups (HP, DR, and RAN). Although these assumptions are violated here, the ANOVA
is still applied: according to Riffenburgh (2012), the ANOVA ‘‘is fairly robust against
these assumptions’’ (p. 265), especially in those studies in which the sample size is high. In
order to counter-check the results of the ANOVA, the Kruskal–Wallis rank test (KW test)
has been additionally applied as the non-parametric alternative.
The effect size eta squared (η²) is additionally calculated for the ANOVA; it is a
measure of the practical significance of the results. Eta squared is the sum of
squares for a factor (here: three groups of papers with different citation profiles) divided by
the total sum of squares. The effect size shows how much of the variation in the sample of
papers (e.g. with respect to the number of authors) is explained by the factor. According to
Cohen (1988), a value of η² = .01 means a small effect, η² = .06 a medium effect, and
η² = .14 a large effect. The consideration of the practical significance is especially
important in studies in which the case numbers are high. There is a risk in
these studies that the results of statistical tests are significant although the effects (e.g.,
mean differences between k groups) are small.
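For illustration, the overall tests and η² can be computed as follows. The page-count data are simulated (hypothetical, not the study's), and η² is taken as the between-group sum of squares over the total sum of squares, as described above:

```python
import numpy as np
from scipy.stats import f_oneway, kruskal

rng = np.random.default_rng(0)
# Simulated page counts for three groups (hypothetical, not the study's data)
hp = rng.poisson(8, 323)
dr = rng.poisson(10, 315)
ran = rng.poisson(7, 323)

F, p = f_oneway(hp, dr, ran)      # parametric overall test (ANOVA)
H, p_kw = kruskal(hp, dr, ran)    # non-parametric counter-check (KW test)

# Eta squared: between-group sum of squares over total sum of squares
groups = [hp, dr, ran]
grand = np.concatenate(groups).mean()
ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
ss_total = sum(((g - grand) ** 2).sum() for g in groups)
eta_sq = ss_between / ss_total
```

With group means this far apart and n around 320 per group, both tests come out significant, while η² shows how much of the total variation the grouping actually explains.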
Beyond the ANOVA, the t test is applied in this study to undertake pairwise
comparisons of group means. Thus, not only is it tested whether the mean differences between all
k groups (where k > 2) are statistically significant, but also the mean differences between
the specific pairs of groups. The t test is seen as a very robust statistic; for the t test,
however, the same assumptions hold as for the ANOVA (see above). Since the
assumptions are not fulfilled in each calculation here, the non-parametric alternative referred to as
the Mann–Whitney two-sample rank-sum test is additionally used. With
multiple pairwise comparisons, the likelihood of incorrectly rejecting the null
hypothesis increases. Thus, the Bonferroni correction is used which compensates for that
by testing each pairwise comparison at a significance level of .05/3 = .017 (.05 is the alpha
level and 3 is the number of pairwise comparisons). As a measure of effect size in addition
to the t test, Cohen's d is applied. For Cohen (1988), d = .2 is a small effect, d = .5 a
moderate effect, and d = .8 a large effect.
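A single pairwise comparison with the Bonferroni-adjusted level and Cohen's d might look like this; the data are again simulated for illustration:

```python
import numpy as np
from scipy.stats import ttest_ind, mannwhitneyu

rng = np.random.default_rng(1)
a = rng.normal(8, 4, 323)    # simulated values for one group (hypothetical)
b = rng.normal(11, 4, 315)   # simulated values for a second group

t_stat, p = ttest_ind(a, b)                              # parametric pairwise test
u_stat, p_mw = mannwhitneyu(a, b, alternative="two-sided")  # rank-sum counter-check

alpha_adj = 0.05 / 3         # Bonferroni: three pairwise comparisons

def cohens_d(x, y):
    """Cohen's d with the pooled standard deviation."""
    nx, ny = len(x), len(y)
    pooled = np.sqrt(((nx - 1) * x.var(ddof=1) + (ny - 1) * y.var(ddof=1))
                     / (nx + ny - 2))
    return (x.mean() - y.mean()) / pooled

d = cohens_d(a, b)           # negative here: group a has the smaller mean
significant = p < alpha_adj  # test against the Bonferroni-adjusted level
```

The comparison is then reported with t, the rank-sum z (or U), the adjusted p value, and d as the measure of practical significance.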
The Chi Square test of independence is used in this study to determine if there is a
significant association between two nominal (categorical) variables. The frequency of a
specific nominal variable is compared with different values of a second nominal variable.
The required data can be shown in an R*C contingency table, where R is the number of rows and C the number of columns.
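As an illustration, a 2×2 table comparing how HPs and DRs distribute over one subject category versus the rest (using the ''Biochemistry & Molecular Biology'' counts reported in the results, HP = 59 of 323 and DR = 9 of 315) can be tested like this:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Rows: in 'Biochemistry & Molecular Biology' vs. not; columns: HP vs. DR
table = np.array([[59, 9],
                  [323 - 59, 315 - 9]])

# chi2_contingency applies the Yates continuity correction for 2x2 tables
chi2, p, dof, expected = chi2_contingency(table)
```

The strongly uneven first row makes the association highly significant; in the study itself, such p values are additionally Bonferroni-adjusted across categories.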
Factors with an influence on citation counts (FICs)
In recent years, many different factors have been identified which may influence the
number of citations a publication receives. Although these factors turn out to be correlated
with citations and causality cannot be assumed
(Bornmann and Leydesdorff 2017)
, they are
generally considered to be influencing factors. On a given time axis, the citations follow
the appearance of a publication with specific characteristics (e.g., a specific number of
authors or pages). However, one should keep in mind that moderating factors might exist. For example, the JIF might count as a FIC; however, high citation counts for papers published in high-impact journals could be the result of the quality of these papers (with the journal acting as a moderating factor).
Before we come in ‘‘Factors with an influence on citation counts (FICs)’’ section to the
FICs and their relationship to HPs and DRs, we show in ‘‘Publishing journals and overall
citation impact’’ section possible differences between both groups concerning their
publishing journal and overall citation impact.
Publishing journals and overall citation impact
Factors with an influence on citation counts (FICs)
With publication year, number of pages, number of references, number of authors, number
of countries, and number of subject categories, factors are considered here, which have
been (frequently) investigated in former studies. Overviews on studies investigating FICs
can be found in Peters and van Raan (1994), Onodera and Yoshikane (2014), and
Bornmann and Daniel (2008).

[Table residue: journals with the most HPs and DRs and their numbers of papers include the Journal of Biological Chemistry, Physical Review Letters, the Journal of Immunology, and Clinical Orthopaedics and Related Research. Table notes: F(2, 960) = 125.61, p = .000, η² = .21 [.16, .25], χ²(2) = 518.91, p = .000; (a) t(1, 636) = −9.12, p = .000, d = −.72 [−.88, −.56], z = −12.09; (b) t(1, 644) = 9.92, p = .000, d = .78 [.62, .94], z = 17.46; (c) t(1, 636) = 12.96, p = .000, d = 1.03 [.86, 1.19], z = 19.07. F(2, 960) = 81.59, p = .000, η² = .15 [.11, .19], χ²(2) = 421.25, p = .000; (a) t(1, 636) = −8.38, p = .000, d = −.66 [−.82, −.50], z = −13.48; (b) t(1, 644) = 4.95, p = .000, d = .39 [.23, .55], z = 12.51; (c) t(1, 636) = 10.00, p = .000, d = .79 [.63, .95], z = 17.80.]

The results of the studies indicate that
publication year, number of pages, number of references, number of authors, number of
countries, and number of subject categories are regarded as possible FICs.
The first FIC which we look at in this study is the publication year of the cited paper
(Ruano-Ravina and Alvarez-Dardet 2012). Besides the journal or field, respectively, in
which a publication appeared, the publication year is generally considered in the
normalization of citations. Since DRs emerge in the long term, we expected an
earlier mean publication year for DRs than for HPs. However, the results in Table 5 show
that the empirical evidence looks different: with M = 1985.2, HPs have on average been
published at about the same time as DRs (M = 1985.6). Furthermore, the differences between the three
groups (HP, DR, and RAN) are statistically not significant and the effect sizes are very low.
The negligible differences in Table 5 are certainly the result of the use of normalized
impact scores for the identification of HPs and DRs. Thus, the results in the table confirm
the effectiveness of the normalization procedure used in this study.
Table 6 shows the differences in the number of pages between HPs, DRs, and RANs.
DRs (M = 9.6, MDN = 8) have more pages than HPs (M = 8.2, MDN = 7) and RANs
(M = 7.3, MDN = 6). However, the reported effect sizes in the table are small in general.
[Table notes: F(2, 957) = 40.15, p = .000, η² = .08 [.05, .11], χ²(2) = 104.85, p = .000; (a) t(1, 636) = 6.80, p = .000, d = .54 [.38, .70], z = 9.37, p = .000; (b) t(1, 643) = 6.38, p = .000, d = .50 [.35, .66], z = 8.61, p = .000; (c) t(1, 635) = −1.02, p = .31, d = −.08 [−.24, .07], z = −.64, p = .52.]
and in agreement with most other country-specific statistics including all publications
(National Science Board 2016). It is followed by Great Britain (n = 76), Japan (n = 42), and
Germany (n = 39). The USA is the only country in Table 9 with a statistically significant
difference in the number of HPs and DRs: With n = 194, significantly more HPs have been
published by authors from the USA than DRs (with n = 139).
Table 10 shows mean differences in number of countries between HPs, DRs, and
RANs. We tested the mean difference since there is evidence that the number of
countries is related to the number of citations (see above). However, our results in Table 10
reveal that the number of countries does not discriminate between the three groups; the
practical significances are small. [Table notes: F(2, 943) = 4.24, p = .02, η² = .01 [.000, .02], χ²(2) = 1.63, p = .44; (a) t(1, 632) = 1.85, p = .06, d = .15 [−.01, .30], z = 1.49, p = .14; (b) t(1, 633) = 2.69, p = .01, d = .21 [.06, .37], z = 2.35, p = .02 (n.s.); (c) t(1, 621) = .18, p = .35, d = .08 [−.08, .23], z = .87, p = .38.]
As a last FIC in this study, we investigated the number of subject categories. The
number of subject categories for a paper can be used as an indicator of inter-disciplinarity.
We used the WoS subject categories which have been assigned by Clarivate Analytics to
the papers on the basis of the publishing journals. Table 11 shows the mean differences in
number of subject categories between HPs, DRs, and RANs. As the results reveal, the
differences are of no practical relevance.
Table 12 reports the ten WoS subject categories with the most HPs and DRs:
‘‘Biochemistry & Molecular Biology’’ (n = 68) and ‘‘Physics, Multidisciplinary’’ (n = 42) are
those categories where most of the papers from both groups belong to. Also, the
table reports the results of statistical significance tests for subject category differences
between HPs and DRs. There are five statistically significant results. ‘‘Biochemistry &
Molecular Biology’’ (HP = 59, DR = 9), ‘‘Immunology’’ (HP = 34, DR = 6), and ‘‘Cell
Biology'' (HP = 22, DR = 4) published more HPs than DRs. In contrast, the subject
categories ''Surgery'' (HP = 3, DR = 37) and ''Orthopedics'' (HP = 0, DR = 33) are more strongly
related to DRs than to HPs.
Discussion and conclusions
The existence of DRs has attracted a lot of attention in scientometrics and beyond.
People are fascinated by the fact that researchers publish results which are ahead of
their time. Studies on DRs dealt either with specific cases of DRs (e.g., Marx 2014) or
with methods of detecting DRs (e.g., Ke et al. 2015). Also, citation profiles showing other
typical distributions than DRs have been proposed. For example,
Ye and Bornmann (2018)
define the citation angle distinguishing between HPs and DRs. HPs are highly-cited
initially, but the impact decreases quickly. Based on a comprehensive dataset of papers
published between 1980 and 1990, we searched for HPs and DRs for further analyses in
this study. In contrast to many other studies on DRs, we calculated DNIC values and used
these scores for the search of HPs and DRs instead of raw citation counts. In this study, we
were interested in identifying systematic differences between HPs and DRs.
[Table notes: F(2, 958) = 4.88, p = .01, η² = .01 [.001, .03], χ²(2) = 7.05, p = .03 (n.s.); (a) t(1, 636) = −3.10, p = .002, d = −.25 [−.40, −.09], z = −2.78, p = .006; (b) t(1, 644) = −.93, p = .35, d = −.07 [−.23, .08], z = −.29, p = .78; (c) t(1, 636) = 2.04, p = .04 (n.s.), d = .16 [.01, .32], z = 2.40, p = .0165. The corresponding table shows absolute and relative numbers as well as Pearson χ² values with Bonferroni-adjusted p values (statistically significant results printed in bold).]
The investigation of several variables brought about some interesting results. Since this
is the first study investigating differences between HPs and DRs, the results cannot be
compared with those of other studies. The investigation of the journals which have
published HPs and DRs revealed that some journals (e.g. Physical Review Letters and PNAS)
were able to publish significantly more HPs than other journals. This pattern did not appear
in DRs in this study. Here, the distribution of papers across journals is similar to that in a
random sample. However, this result does not agree with the results of van Raan (2015). He found specific
patterns also for DRs. He identified institutions (e.g. MIT) that have more DRs than can be
expected based on their relative contribution to the field (in his case: physics). The same
was found for journals, particularly Physical Review B and Nuclear Physics B. Based on
the results, van Raan (2015) stated that ‘‘a new and interesting question arises whether this
type of observations could say something about institutions which are more prone than
other institutions to accepting (and publishing) out-of-the-box work’’.
In terms of the MNCS (based on single journals or fields), HPs and DRs received impact
scores which are significantly above average. However, the citation impact of the DRs is
significantly higher than that of the HPs. Many HPs and DRs have been published by
authors from the USA; however, in contrast to other countries, authors from the USA have
published statistically significantly more HPs than DRs. For other countries, the differences
between HPs and DRs are statistically not significant. The WoS subject categories in which
the most HPs and DRs have been published are ‘‘Biochemistry & Molecular Biology’’ and
‘‘Physics, Multidisciplinary.’’ Whereas ‘‘Biochemistry & Molecular Biology,’’
''Immunology,'' and ''Cell Biology'' have published significantly more HPs than DRs, the
opposite was found for ''Surgery'' and ''Orthopedics.'' The investigation of HPs and
DRs with regard to FICs (e.g., the number of authors) shows that HPs have significantly
more authors and more (linked) references than DRs/RANs.
The results of this study indicate that especially HPs differ from RANs with respect to
certain properties (e.g. the number of authors), but not necessarily DRs. Our
results therefore suggest that the emergence of DRs is an unpredictable process which
cannot be tied to certain properties of the papers. With HPs, this prediction might be
possible to a certain extent (Yu et al. 2014). However, this study was a first step in
analyzing HPs and DRs in comparison. It would be interesting if future studies addressed the
topic of differences between both groups by using data from other bibliometric databases
(especially subject specific databases, as the chemistry-related CA database or the
economics RePEc database). These studies could investigate similar variables as those in this
study in order to test whether the results of this study can be confirmed. The inclusion of
additional variables could reveal further insights into both phenomena: HPs and DRs. Of
special interest are variables which cannot be gathered in WoS. Thus, it could be tested
whether the publication of HPs and DRs are related to certain characteristics of authors
(e.g. their gender or nationality) or their institutions. Are there certain groups of authors
which have published more DRs in the past than can be expected?
In this study, we used field-normalized scores to identify HPs and DRs. Many papers in
the WoS database do not belong to only one but to several fields. Thus, it would be
interesting to identify in future studies those papers which are ''normal'' in one field but
DRs or HPs, respectively, in another.
Acknowledgements Open access funding provided by Max Planck Society. We acknowledge the National
Natural Science Foundation of China Grant No. 71673131. We thank Simon S. Li for support in program
coding and computing. The bibliometric data used in this paper are from an in-house database developed and
maintained by the Max Planck Digital Library (MPDL, Munich) and derived from the Science Citation
Index Expanded (SCI-E), Social Sciences Citation Index (SSCI), Arts and Humanities Citation Index
(AHCI) prepared by Clarivate Analytics, formerly the IP & Science business of Thomson Reuters.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution,
and reproduction in any medium, provided you give appropriate credit to the original author(s) and the
source, provide a link to the Creative Commons license, and indicate if changes were made.
Acock , A. C. ( 2016 ). A gentle introduction to Stata (5th ed .). College Station: Stata Press.
Baumgartner , S. E. , & Leydesdorff , L. ( 2014 ). Group-based trajectory modeling (GBTM) of citations in scholarly literature: Dynamic qualities of ''transient'' and ''sticky knowledge claims'' . Journal of the Association for Information Science and Technology , 65 ( 4 ), 797 - 811 . https://doi.org/10.1002/asi. 23009.
Beaver , D. B. ( 2004 ). Does collaborative research have greater epistemic authority? Scientometrics , 60 ( 3 ), 399 - 408 .
Bornmann , L. , Bowman , B. F. , Bauer , J. , Marx , W. , Schier , H. , & Palzenberger , M. ( 2014 ). Bibliometric standards for evaluating research institutes in the natural sciences . In B. Cronin & C. Sugimoto (Eds.), Beyond bibliometrics: harnessing multidimensional indicators of scholarly impact (pp. 201 - 223 ). Cambridge: MIT Press.
Bornmann , L. , & Daniel , H. -D. ( 2008 ). What do citation counts measure? A review of studies on citing behavior . Journal of Documentation , 64 ( 1 ), 45 - 80 . https://doi.org/10.1108/00220410810844150.
Bornmann , L. , & Leydesdorff , L. ( 2017 ). Skewness of citation impact data and covariates of citation distributions: A large-scale empirical analysis based on Web of Science data . Journal of Informetrics , 11 ( 1 ), 164 - 175 .
Cohen , J. ( 1988 ). Statistical power analysis for the behavioral sciences (2nd ed .). Hillsdale: Lawrence Erlbaum Associates, Publishers.
Comins , J. A. , & Leydesdorff , L. ( 2016 ). Identification of long-term concept-symbols among citations: Can documents be clustered in terms of common intellectual histories? Retrieved January 5 , 2016 , from http://arxiv.org/abs/1601.00288.
Costas , R., van Leeuwen, T. N. , & van Raan , A. F. J. ( 2010 ). Is scientific literature subject to a 'Sell-ByDate'? A general methodology to analyze the 'durability' of scientific documents . Journal of the American Society for Information Science and Technology , 61 ( 2 ), 329 - 339 . https://doi.org/10.1002/asi. 21244.
Cressey , D. ( 2015 ). 'Sleeping beauty' papers slumber for decades. Research identifies studies that defy usual citation patterns to enjoy a rich old age . Retrieved April 26 , 2016 , from http://www.nature.com/news/ sleeping-beauty -papers-slumber-for-decades- 1 . 17615 .
Didegah , F. , & Thelwall , M. ( 2013 ). Determinants of research citation impact in nanoscience and nanotechnology . Journal of the American Society for Information Science and Technology , 64 ( 5 ), 1055 - 1064 . https://doi.org/10.1002/asi.22806.
Fok , D. , & Franses , P. H. ( 2007 ). Modeling the diffusion of scientific publications . Journal of Econometrics , 139 ( 2 ), 376 - 390 . https://doi.org/10.1016/j.jeconom. 2006 . 10 .021.
Garfield , E. ( 1970 ). Would Mendel's work have been ignored if the Science Citation Index was available 100 years ago ? Essays of an Information Scientist , 1 , 69 - 70 .
Garfield, E. (1980). Premature discovery or delayed recognition - why? Current Contents, 21, 5-10 (Reprinted in: Garfield, E. Essays of an information scientist. Philadelphia: ISI Press, 1979-1980, Vol. 4, 488-493).
Garfield, E. (1989a). Delayed recognition in scientific discovery: Citation frequency analysis aids the search for case histories. Current Contents, 23, 3-9.
Garfield, E. (1989b). More delayed recognition. 1. Examples from the genetics of color-blindness, the entropy of short-term memory, phosphoinositides, and polymer rheology. Current Contents, 38, 3-8.
Garfield, E. (1990). More delayed recognition. 2. From inhibin to scanning electron microscopy. Current Contents, 9, 3-9.
Gillmor, C. S. (1975). Citation characteristics of JATP literature. Journal of Atmospheric and Terrestrial Physics, 37(11), 1401-1404.
Glänzel, W., & Garfield, E. (2004). The myth of delayed recognition. Scientist, 18(11), 8.
Glänzel, W., Schlemmer, B., & Thijs, B. (2003). Better late than never? On the chance to become highly cited only beyond the standard bibliometric time horizon. Scientometrics, 58(3), 571-586.
Gorry, P., & Ragouet, P. (2016). "Sleeping beauty" and her restless sleep: Charles Dotter and the birth of interventional radiology. Scientometrics, 107(2), 773-784. https://doi.org/10.1007/s11192-016-1859-8.
Haustein, S., Larivière, V., & Börner, K. (2014). Long-distance interdisciplinary research leads to higher citation impact. In P. Wouters (Ed.), Proceedings of the science and technology indicators conference 2014 Leiden "Context Counts: Pathways to Master Big and Little Data" (pp. 256-259). Leiden, The Netherlands: University of Leiden.
Hegarty, P., & Walton, Z. (2012). The consequences of predicting scientific impact in psychology using journal impact factors. Perspectives on Psychological Science, 7(1), 72-78. https://doi.org/10.1177/1745691611429356.
Huang, T. C., Hsu, C., & Ciou, Z. J. (2015). Systematic methodology for excavating sleeping beauty publications and their princes from medical and biological engineering studies. Journal of Medical and Biological Engineering, 35(6), 749-758. https://doi.org/10.1007/s40846-015-0091-y.
Iribarren-Maestro, I., Lascurain-Sanchez, M. L., & Sanz-Casado, E. (2007). Are multi-authorship and visibility related? Study of ten research areas at Carlos III University of Madrid. In D. Torres-Salinas & H. F. Moed (Eds.), Proceedings of the 11th conference of the international society for scientometrics and informetrics (Vol. 1, pp. 401-407). Madrid, Spain: Spanish Research Council (CSIC).
Ke, Q., Ferrara, E., Radicchi, F., & Flammini, A. (2015). Defining and identifying sleeping beauties in science. Proceedings of the National Academy of Sciences, 112(24), 7426-7431. https://doi.org/10.1073/pnas.1424329112.
Kline, R. B. (2004). Beyond significance testing: Reforming data analysis methods in behavioral research. Washington, DC: American Psychological Association.
Vanclay, J. K. (2013). Factors affecting citation rates in environmental science. Journal of Informetrics, 7(2), 265-271. https://doi.org/10.1016/j.joi.2012.11.009.
Vinkler, P. (2010). The evaluation of research by scientometric indicators. Oxford: Chandos Publishing.
Waltman, L. (2016). A review of the literature on citation impact indicators. Journal of Informetrics, 10(2), 365-391.
Waltman, L., van Eck, N. J., van Leeuwen, T. N., Visser, M. S., & van Raan, A. F. J. (2011a). Towards a new crown indicator: An empirical analysis. Scientometrics, 87(3), 467-481. https://doi.org/10.1007/s11192-011-0354-5.
Waltman, L., van Eck, N. J., van Leeuwen, T. N., Visser, M. S., & van Raan, A. F. J. (2011b). Towards a new crown indicator: Some theoretical considerations. Journal of Informetrics, 5(1), 37-47. https://doi.org/10.1016/j.joi.2010.08.001.
Wang, J. (2013). Citation time window choice for research impact evaluation. Scientometrics, 94(3), 851-872. https://doi.org/10.1007/s11192-012-0775-9.
Webster, G. D., Jonason, P. K., & Schember, T. O. (2009). Hot topics and popular papers in evolutionary psychology: Analyses of title words and citation counts in Evolution and Human Behavior, 1979-2008. Evolutionary Psychology, 7(3), 348-362.
Wesel, M., Wyatt, S., & Haaf, J. (2013). What a difference a colon makes: How superficial factors influence subsequent citation. Scientometrics. https://doi.org/10.1007/s11192-013-1154-x.
Ye, F. Y., & Bornmann, L. (2018). "Smart Girls" versus "Sleeping Beauties" in the sciences: The identification of instant and delayed recognition by using the citation angle. Journal of the Association for Information Science and Technology, 69(3), 359-367.
Yu, T., Yu, G., Li, P.-Y., & Wang, L. (2014). Citation impact prediction for scientific papers using stepwise regression analysis. Scientometrics, 101(2), 1233-1252. https://doi.org/10.1007/s11192-014-1279-6.