An hypothesis approach to the solution of anagrams
GERALD A. MENDELSOHN
Institute of Personality Assessment and Research, University of California, Berkeley, California 94720
The attempts of subjects to reorganize the letters of an anagram were construed as a series of hypotheses about the correct letter order. It was predicted, consequently, that variables which reduce the number of tenable hypotheses or influence the order in which hypotheses are generated will affect problem difficulty. Five such variables, plus solution word frequency, were used to predict solution probabilities in two studies. The multiple Rs obtained were .92 and .82 and the two regression equations were effectively interchangeable. The process of anagram solution was described as entailing the retrieval of words from memory storage on the basis of letter order cues generated by the subject or, less usually, present in the anagram itself.
The process of anagram solution can be analyzed into
two phases consisting, respectively, of attempts to
reorganize the letters of the anagram and to retrieve
the solution word from memory. In a recent paper,
Mendelsohn and O'Brien (1974) argued that in the
reorganization phase subjects formulate hypotheses
about the correct letter order on the basis of the
empirical probabilities of letter events in the language.
Like previous investigators following the lead of
Mayzner and Tresselt (1962; see also Mayzner, Tresselt,
& Helbock, 1964), they assumed that the bigram is the
basic unit with which the subject works in reorganizing
the anagram, but in their formulation, the pool of
bigrams that can be formed from the anagram constitutes
a limited set of hypotheses from which the subject
samples. The order in which hypotheses are formed
corresponds roughly to the relative transition letter
probabilities (TPs) of the bigrams in the pool, the most
probable appearing first, and so on. The hypotheses
are then tested by attempting to retrieve from memory
a whole word which matches the partial reorganization
of the anagram or by attempting to arrange the
remaining letters about the hypothesized pair.
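The sampling scheme described above can be sketched in a few lines of code. This is only an illustration, not the procedure of the studies discussed here: the function names are hypothetical, the TP table is a made-up dictionary supplied by the caller, and a single TP per bigram is assumed, with any pair missing from the table treated as having a TP of zero.

```python
from itertools import permutations


def bigram_pool(anagram):
    """Return every ordered letter pair that can be formed from the anagram.

    Pairs are built from distinct positions, so a doubled letter (e.g. "ee")
    appears only if the letter occurs twice in the anagram.
    """
    return sorted({a + b for a, b in permutations(anagram, 2)})


def order_hypotheses(anagram, tp):
    """Order the pool from most to least probable bigram.

    `tp` is a hypothetical dict mapping a bigram to its transition
    probability; ties are broken alphabetically so the order is stable.
    """
    pool = bigram_pool(anagram)
    return sorted(pool, key=lambda bg: (-tp.get(bg, 0.0), bg))
```

Calling `order_hypotheses("hat", {"th": 0.9, "ha": 0.5, "ht": 0.1})` would place the three listed bigrams first, in descending order of their invented TPs, followed by the remaining pairs.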
It follows from this formulation that, when a solution
word consists of bigrams which are probable relative
to the other (incorrect) bigrams which can be formed
from the anagram, the likelihood of solution should be
high and solution latency should be low. Conversely,
when the bigrams of a solution word are relatively
improbable, high latencies and infrequent solutions
should be the case. The measure developed by
Mendelsohn and O'Brien (1974) to index the relative
frequencies of correct and incorrect bigrams consisted
of the sum, across the correct bigrams, of the number
of incorrect bigrams having higher TPs. It should be
noted that the TPs were obtained from Mayzner and
Tresselt's (1965) frequency counts and, thus, take
word length and letter position into account.
Mendelsohn and O'Brien reported a correlation of about
.75 between this measure and solution scores, thus
confirming their prediction. It was found, further, that
this relationship was uninfluenced by solution word
frequency.
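The measure can be illustrated with a minimal sketch. It rests on assumptions that are not part of the original report: the name `bigram_rank` is hypothetical, the TP table is an invented dictionary, and each bigram is given one TP regardless of letter position, whereas the published counts take word length and position into account.

```python
from itertools import permutations


def bigram_rank(solution, tp):
    """Sum, over the correct bigrams, of the number of incorrect bigrams
    with a higher TP.

    Correct bigrams are the adjacent letter pairs of the solution word;
    incorrect bigrams are all other ordered pairs formable from its letters.
    `tp` is a hypothetical dict of bigram -> transition probability, and a
    bigram absent from it is treated as having TP 0.
    """
    correct = [solution[i:i + 2] for i in range(len(solution) - 1)]
    pool = {a + b for a, b in permutations(solution, 2)}
    incorrect = pool - set(correct)
    return sum(
        sum(1 for bg in incorrect if tp.get(bg, 0.0) > tp.get(c, 0.0))
        for c in correct
    )
```

A low score means that the correct bigrams are among the most probable in the pool, which on the present account should make the anagram easier.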
The present paper seeks to extend the hypothesis
approach to anagram solution by examining characteristics
of letters and words other than bigram TPs which
may either constrain the number of tenable hypotheses
the subject can formulate or affect the sequence in
which hypotheses are likely to be formulated. Four such
variables were identified as follows:
1. Total number of bigrams with frequency greater
than zero. It should, in general, be the case that the
fewer the tenable hypotheses (i.e., bigrams with frequency
> 0 which can be formed from the letters of the
anagram), the less the problem difficulty. This variable
is conceptually related to Ronning's (1965) "ruleout"
factor, but is based on the bigram unit rather than on
all letters of the anagram considered at once.
2. Number of vowels in the solution word. With very
few exceptions, words of five or more letters containing
a single vowel do not begin or end with the vowel.
Consequently, in this case, otherwise tenable bigram
hypotheses involving the vowel can be eliminated and
only those in which the vowel occupies an interior
position require consideration. The effect should be to
reduce the number of tenable hypotheses and, thus,
problem difficulty for one-vowel words.
3. First letter of the solution word, vowel or consonant.
Although there are many words which begin
with vowels, it is far more common for the first letter
to be a consonant. In Mayzner and Tresselt's (1965)
sample of 3,422 five-letter words, for example, 85%
begin with a consonant; likewise, a vowel is less likely
to occupy the first position than any other in five-letter
words. Consequently, hypotheses involving a VC combination
in Positions 1-2 should occur relatively late
in the sequence of reorganization attempts. Anagrams
formed from solution words beginning with a vowel
should thus be more difficult to solve.
4. Key letters. Cohen (1968) found that anagrams
containing one of the six most infrequent letters in the
language (J, K, Q, V, X, Z) were easier to solve. He
reasoned that "uncommon letters, if present, reduced
the number of letter groupings which were plausible"
and thereby "remove more uncertainty than common
letters" (p. 80). The letter "V," for example, the most
frequent of the 6 letters above, can be preceded by
12 different letters and followed by only 6, all vowels,
according to the Underwood and Schulz (1960) count.
In contrast, "B," the next most infrequent letter, can
be preceded by 19 and followed by 17 letters. It is not
infrequency per se which is critical but the limited
number of combinatorial possibilities the letters possess.
Thus, the presence of an infrequent letter provides the
possibility of a maximizing strategy, i.e., if the reorganization
effort begins with a bigram including an infrequent
letter, the number of plausible hypotheses can be
rapidly exhausted and in some cases reduced to one or
two. Consequently, independent of the total number
of plausible hypotheses or the relative probabilities of
hypotheses, problem difficulty should be reduced in
the case of a key letter.
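The four variables can be coded for a given solution word roughly as follows. This is a sketch under stated assumptions rather than the scoring actually used: the function name and the bigram frequency table are hypothetical, only A, E, I, O, and U are counted as vowels, and the first of Cohen's six letters is read as J.

```python
from itertools import permutations

VOWELS = set("aeiou")        # assumption: "y" is not counted as a vowel
KEY_LETTERS = set("jkqvxz")  # assumption: first letter of the list read as J


def predictors(solution, bigram_freq):
    """Code the four solution-word variables for one word.

    `bigram_freq` is a hypothetical dict of bigram -> frequency count;
    a bigram absent from it is treated as having frequency 0.
    """
    word = solution.lower()
    pool = {a + b for a, b in permutations(word, 2)}
    return {
        # 1. bigrams with frequency > 0 formable from the letters
        "tenable_bigrams": sum(1 for bg in pool if bigram_freq.get(bg, 0) > 0),
        # 2. number of vowels in the solution word
        "n_vowels": sum(ch in VOWELS for ch in word),
        # 3. whether the first letter is a vowel
        "starts_with_vowel": word[0] in VOWELS,
        # 4. whether an infrequent "key" letter is present
        "has_key_letter": any(ch in KEY_LETTERS for ch in word),
    }
```

Each word thus yields one count, one vowel tally, and two binary indicators, which can be entered alongside bigram rank and word frequency as predictors.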
Six solution word variables [bigram rank (BR),
word frequency (WF), and the four just described]
were used as predictors of anagram difficulty in two
studies. In addition to zero-order correlations, multiple
correlations were obtained and the results for each
sample cross-validated on the other.
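One way to picture this cross-validation is to fit a least-squares regression in one sample and correlate its predictions with the observed scores in the other. The sketch below assumes that reading; it is not the original analysis, all function names are hypothetical, and the code uses plain normal equations with no protection against collinear predictors.

```python
from math import sqrt


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # partial pivoting: move the largest remaining entry onto the diagonal
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(X, y):
    """Least-squares weights (intercept first) from the normal equations."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(w, X):
    """Apply weights w (intercept first) to each row of predictors."""
    return [w[0] + sum(c * v for c, v in zip(w[1:], r)) for r in X]


def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))


def cross_validate(X1, y1, X2, y2):
    """Return (multiple R in sample 1, R when sample-1 weights meet sample 2)."""
    w = fit(X1, y1)
    return pearson(predict(w, X1), y1), pearson(predict(w, X2), y2)
```

Swapping the two samples in the call gives the cross-validation in the other direction.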
Study 1
The data of the first study were drawn from Mendelsohn
and O'Brien (1974). Since that paper includes a detailed description
of the selection of solution words, construction of anagrams
and anagram lists, subjects, and procedure, a brief summary
can suffice. Thirty five-letter solution words were used,
half of them low frequency (6 to 10) in the Kucera and Francis
(1967) count and half high frequency (35 to 100). There were
five low-BR (5-30), five middle-BR (50-70), and five high-B (...truncated)