When is recall higher than recognition?
ENDEL TULVING
UNIVERSITY OF TORONTO
Sixteen Ss learned a list of 48 paired words (A-B pairs) to
a criterion of two perfect trials and were then tested for the
recognition of B-members of all pairs. Approximately 10% of
the learned words were not correctly identified in the recognition test. These data show that recall is higher than recognition when retrieval cues present at the recall test are more
effective in providing access to stored information than are
retrieval cues present at the recognition test.
Recognition tests of memory usually yield higher
scores of retention than do recall tests. Such superiority of recognition over recall is explained in
terms of two factors: (1) the differences in the number
of alternatives from among which correct responses
are to be selected (Brown, 1965; Davis, Sutherland, &
Judd, 1961; Slamecka, 1967), and (2) the differences
in the amount and nature of retained information
necessary for the identification and for the unaided
reproduction of learned items (McNulty, 1965). If the
two factors are equated in recall and in recognition,
then the two measures should, and sometimes do,
yield identical estimates of retention.
No extant theoretical formulation of recall and
recognition specifies any conditions under which recall might be superior to recognition, even though
some experiments demonstrating this relationship
have been reported (e.g., Bahrick & Bahrick, 1964;
Lachman & Field, 1965). In these experiments, however, recognition tests of individual items were paced
by E, and S's failure to correctly identify items
learned earlier may have been related to lack of
time for considering the alternatives available to S
on the basis of unaided recall. The present paper
reports a simple experiment showing that Ss sometimes fail to recognize items they can reproduce
even if they have unlimited time in the recognition test.
An explanation of this phenomenon is also proposed.
Method
Sixteen Ss, undergraduate and graduate psychology
students of both sexes, participated in the experiment.
Each S learned a single list of 48 paired words (A-B
pairs), under the typical paired-associate anticipation
conditions, and was then tested for the recognition
of B members of the pairs.
Two different lists were used, each with eight Ss.
In both lists, A and B members of all pairs were
common monosyllabic English words. To eliminate
the massive amount of experimental practice necessary for the mastery of a 48-pair list, A-B pairs
were made up of words meaningfully related to each
other. In List 1, for instance, pairs such as the following were used: TOOTH-ACHE, AIR-PORT, FLOOR-SHOW, HOME-STEAD. The A members of pairs in
List 2 were identical with the A members of List 1,
but B members were all different. Thus, the pairs
in List 2 corresponding to the examples given from
List 1 were: TOOTH-PICK, AIR-CRAFT, FLOOR-CLOTH, HOME-SICK.
Psychon. Sci., 1968, Vol. 10 (2)
Each S was tested individually. On the first trial
the prompting method was used. Each pair of words,
hand-printed on a 3 x 5 card, was shown to S, followed
by a card bearing only the A member of the pair.
The S was asked to call out the B member of the pair
when presented with the A member. No S made any
errors on this trial. Beginning with the second trial,
the standard paired-associate method was used. A
given A member was shown, the S asked to name the
B member and then, regardless of the S's response,
both A and B members of the same pair were presented. One trial consisted of the test and presentation of all 48 pairs. The presentation of cards bearing
A members and pairs was paced by S and thus the
rate varied from S to S. On the average, Ss took
approximately 5 min to go through the list on the
second trial (first anticipation trial), but they speeded
up in the course of practice, and on the fifth trial
the average time per trial had been reduced to approximately 3 min. Two different orders of presentation
of pairs were used with each S, but these orders
were different for different Ss.
Paired-associate training was continued with each
S until the S had anticipated all 48 B members correctly on two consecutive trials. The S was then
given a sheet of paper on which were printed, in
alphabetical order, the 96 B members of Lists 1
and 2, and he was asked to check off all the words
he had just learned. Thus, all 16 Ss took the same
recognition test, with one-half of the items "old"
for one group and the other half old for the other
group. This procedure eliminated any possible effects
of response bias or guessing on the recognition scores.
Unlimited time was allowed for this test.
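
The scoring of such a check-off sheet can be sketched as follows. This is only a minimal illustration and not part of the original procedure: the function name score_recognition, the four-word subsets (built from the example pairs given above), and the checked responses are all assumptions, and the actual sheet carried all 96 B members.

```python
def score_recognition(checked, old_items, new_items):
    """Return (hits, misses, false_alarms) for one S's check-off sheet."""
    checked = set(checked)
    old_items = set(old_items)
    new_items = set(new_items)
    hits = len(checked & old_items)          # old words identified as old
    misses = len(old_items - checked)        # old words left unchecked
    false_alarms = len(checked & new_items)  # new words identified as old
    return hits, misses, false_alarms


# B members of the example pairs only (hypothetical four-word subsets).
LIST_1_B = {"ACHE", "PORT", "SHOW", "STEAD"}
LIST_2_B = {"PICK", "CRAFT", "CLOTH", "SICK"}

# The sheet lists the B members of both lists in alphabetical order.
sheet = sorted(LIST_1_B | LIST_2_B)

# A hypothetical S who learned List 1, checking three old words and one new word.
checked = ["ACHE", "PORT", "SHOW", "PICK"]
hits, misses, false_alarms = score_recognition(checked, LIST_1_B, LIST_2_B)
print(sheet)
print(hits, misses, false_alarms)
```

For an S who learned List 2, the same sheet is scored with the old and new arguments exchanged.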
Results
The data were pooled over both groups (i.e., both
lists), since there were no apparent differences in
the results of the two groups. The number of trials
required to reach the criterion (excluding the first
prompting trial, but including the two criterion trials)
ranged from 5 to 9, the mean for 16 Ss being 7.2.
Every S reached the criterion of two perfect trials
immediately following his first perfect trial. It is a
safe conclusion, therefore, that had the Ss been given
another paced paired-associate test trial, their performance would have been perfect on that trial.
The numbers of correct identifications of old items
as old ranged from 36 to 47, with a mean of 43.4. Thus,
on the average, Ss failed to recognize 4.6 items
among the 48 that they had been able to recall in
the presence of A words. The mean number of incorrect identifications of new items as old was .87, and
the median was zero.
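
The arithmetic behind these figures can be checked with a short sketch. The variable names are illustrative assumptions. The input values are the means reported above, and the false-alarm proportion is derived on the assumption that each S saw 48 new items (the B members of the other list).

```python
LIST_LENGTH = 48          # pairs learned, and also new items on the sheet
mean_hits = 43.4          # mean correct identifications of old items
mean_false_alarms = 0.87  # mean new items incorrectly identified as old

# Items recalled to criterion but not recognized, per S.
mean_misses = LIST_LENGTH - mean_hits

# Proportion of learned words not recognized (the "approximately 10%").
miss_rate = mean_misses / LIST_LENGTH

# Proportion of new words incorrectly called old.
false_alarm_rate = mean_false_alarms / LIST_LENGTH

print(f"mean misses: {mean_misses:.1f}")
print(f"miss rate: {miss_rate:.3f}")
print(f"false-alarm rate: {false_alarm_rate:.3f}")
```

The miss rate of roughly 0.096 is several times the false-alarm rate of roughly 0.018, which is consistent with the statement that guessing played little part in the recognition scores.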
Discussion
The results of this experiment clearly show that it
is possible for Ss to recall, that is, to reproduce
from memory, learned verbal units even if they cannot
identify these units as old items in a recognition test.
Thus, under the conditions of the present experiment,
recall is superior to recognition. While the generality
of these results (with respect to factors such as the
nature of the material, length of the list, amount of
practice given prior to the recognition and recall
tests, and the like) remains to be determined, some
relevant features of the conditions of the experiment
must be identified for the purpose of the interpretation of the results.
The recall test was one involving aided or cued
recall. The Ss reproduced each B item in the presence
of a specific retrieval cue, the corresponding A item.
Noncued recall of B items certainly would have been
considerably lower than the obtained cued recall (cf.
Tulving & Pearlstone, 1965), and also lower than the
observed recognition performance. On the other hand,
it is an equally safe conclusion that aided or cued
recognition (recognition of a B item in the presence
of its corresponding A item) would have been at least
as high as cued recall. Recall cannot be higher than
recognition as long as retrieval cues are identical,
but it can be higher if retrieval cues are different
in the two test situations.
The result (...truncated)