PatPho: A phonological pattern generator for neural networks

BRIAN MACWHINNEY
Carnegie Mellon University, Pittsburgh, Pennsylvania

PING LI
University of Richmond, Richmond, Virginia

Much of the power of neural network modeling for language use and acquisition derives from a reliance on statistical regularities implicit in the phonological properties of words. Researchers have devised several methods for representing the phonology of words, but these methods are often either unable to represent realistically sized lexicons or inadequate in the ways they represent individual words. In this paper, we present a new phonological pattern generator (PatPho) that allows connectionist modelers to derive accurate phonological representations of the English lexicon. PatPho not only generates phonological patterns that can scale up to realistically sized lexicons, but also accurately and parsimoniously captures the similarity structures of the phonology of monosyllabic and multisyllabic words.

Rumelhart and McClelland's (1986) connectionist model of the acquisition of the English past tense had a profound positive impact on the fields of artificial neural networks, language acquisition, and cognitive psychology. However, that model was also heavily criticized for the way it represented phonological patterns of the verbal input. The fundamental structure of the past tense learning model was a nonstandard phonological structure called the Wickelfeature. Critics (Lachter & Bever, 1988; Pinker & Prince, 1988) argued that these distributed feature structures were unable to faithfully represent the phonological structures of words and the differences between words. As a result of these problems, connectionist researchers subsequently abandoned the use of Wickelfeatures as a way to represent phonological input, using instead a variety of alternative systems for phonological representations. These methods fall roughly into three categories.
The first class of methods (e.g., Plunkett & Marchman, 1991, 1993) treats the word as a simple string of phonemes. For example, Plunkett and Marchman used 6 binary units to code each of the three positions in a set of English consonant-vowel-consonant (CVC), VCC, and CCV wordlike strings. Features included voicing, sonority, and place and manner of articulation. Because each of the three segments used six features, 18 units were needed to code a three-phoneme word. A representation of this type provides only an approximation to the phonology of words, owing to its use of arbitrarily determined binary values for phonological features. In addition, the representation accommodates only a limited number of monosyllables. Because of these problems, it is not a good choice for simulations that attempt to model the learning of a realistic lexicon.

This research was supported by Grants BCS-9975249 and BCS998009 from the National Science Foundation. We thank Xiaoming Zhao and Lihua Chen, who assisted in the development of the source code, and Igor Farkas for helping with the binary coding and conducting the PCA analyses. Please address correspondence to P. Li, Department of Psychology, University of Richmond, Richmond, VA 23173 (e-mail: ).

Miikkulainen (1997) used a variant of this scheme with five units on a continuous scale to represent the phonological features of each English phoneme. In his scheme, a word is a simple concatenation of its component phonemes. This extended representation scheme can accommodate words beyond monosyllables. It also provides a more accurate representation of the phonological features, because of its use of continuous units instead of binary units. However, it has problems capturing the similarity between words of different phonemic lengths.
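The difficulty with concatenated coding can be illustrated with a minimal sketch. The feature values below are invented placeholders, not the codes used by Miikkulainen (1997) or by PatPho, and the letter "o" is only a stand-in for the vowel. The sketch assumes five units per phoneme and shows that two words differing by one initial consonant end up with vectors of different lengths in which no phoneme occupies the same block of units.

```python
# Toy feature vectors: values are made up for illustration only and are
# NOT the codes used by Miikkulainen (1997) or by PatPho.
TOY_FEATURES = {
    "s": [0.9, 0.1, 0.8, 0.2, 0.0],
    "p": [0.1, 0.9, 0.2, 0.7, 0.0],
    "o": [0.5, 0.5, 0.1, 0.1, 1.0],  # "o" is a stand-in for the vowel
    "t": [0.2, 0.8, 0.6, 0.3, 0.0],
}
UNITS_PER_PHONEME = 5  # assumption taken from the five-unit scheme


def concatenate(phonemes):
    """Build a word vector by joining the phoneme vectors end to end."""
    vector = []
    for phoneme in phonemes:
        vector.extend(TOY_FEATURES[phoneme])
    return vector


def aligned_matches(word_a, word_b):
    """Count positions where both words put the same phoneme in the same block."""
    return sum(1 for a, b in zip(word_a, word_b) if a == b)


spot = ["s", "p", "o", "t"]
pot = ["p", "o", "t"]

spot_vec = concatenate(spot)
pot_vec = concatenate(pot)

print(len(spot_vec), len(pot_vec))   # 20 15
print(aligned_matches(spot, pot))    # 0
```

Although the two words share three of their phonemes, the count of aligned phonemes is zero, because every shared phoneme is shifted by one block.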
For example, spot and pot in this coding will end up sharing very little similarity, because phonemic concatenation leads to dislocated positioning of similar phonemes: For spot, Units 1-5 represent /s/ and Units 6-10 represent /p/, whereas for pot, Units 1-5 represent /p/ and Units 6-10 represent //, and so on. Thus, the same phoneme activates completely different units in the representation (see Plaut, McClelland, Seidenberg, & Patterson, 1996, for a discussion of a similar problem in orthographic representations).

A second method for representing phonological patterns encodes no more than a single segment at a time. For example, the NetTalk system (Sejnowski & Rosenberg, 1988) uses a read-head approach to processing, which accepts single English orthographic letters one by one and then outputs the corresponding English sounds. To do this, the system maintains a local memory of the context. This form of representation is unable to capture larger phonological patterns and cannot deal with word-based irregularities or nonlocal phonological patterns.

A third method for representing phonological patterns relies on the slot-based representation introduced by MacWhinney and Leinbach (1991) and applied in a variety of later models (Joanisse & Seidenberg, 1999; Plaut et al., 1996; Plunkett & Juola, 1999). MacWhinney and Leinbach showed how the switch from Wickelfeatures to slot-based representations solved many of the problems with Rumelhart and McClelland's (1986) model of past tense learning. By using slot-based representations, the phonology of a word is encoded in terms of a template with a fixed set of slots, rather than as a string with either a fixed or a variable length or as a series of isolated segments. This method has its basis in autosegmental phonological theory, according to which phonemes are bundles of features in metric syllabic grids (Goldsmith, 1976; Levelt, 1989).
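The fixed-template idea can be sketched in a few lines. This is only an illustration under stated assumptions: the eight-slot CCCVVCCC template, the helper names, and the two hand-written filled strings are hypothetical choices for the example, and the sketch does not reproduce the slot assignment procedure of any of the cited models. In a filled string, an uppercase C or V marks an empty slot and a lowercase letter marks a phoneme placed in that slot.

```python
TEMPLATE = "CCCVVCCC"  # hypothetical monosyllabic template: onset, nucleus, coda
EMPTY = None           # marker for an unfilled slot


def parse_filled_template(filled, template=TEMPLATE):
    """Convert a filled-template string into a list with one entry per slot."""
    if len(filled) != len(template):
        raise ValueError("filled string must have one character per slot")
    slots = []
    for char, kind in zip(filled, template):
        if char in ("C", "V"):
            # An uppercase letter is an empty slot and must match the slot type.
            if char != kind:
                raise ValueError("empty-slot marker does not match the template")
            slots.append(EMPTY)
        else:
            slots.append(char)  # a lowercase letter is a phoneme
    return slots


def shared_slots(slots_a, slots_b):
    """Count slots that both words fill with the same phoneme."""
    return sum(
        1 for a, b in zip(slots_a, slots_b) if a is not EMPTY and a == b
    )


# Hand-written placements for illustration ("o" stands in for the vowel).
spot_slots = parse_filled_template("spCoVtCC")
pot_slots = parse_filled_template("CpCoVtCC")

print(spot_slots)  # ['s', 'p', None, 'o', None, 't', None, None]
print(pot_slots)   # [None, 'p', None, 'o', None, 't', None, None]
print(shared_slots(spot_slots, pot_slots))  # 3
```

Because every word is mapped onto the same fixed set of slots, both representations have the same length, and the three phonemes the words have in common fall in the same slots.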
Each segment in a word is assigned to a different slot, depending on which syllable it belongs to and whether it appears in the syllable's onset, nucleus, or coda. For a monosyllabic word, it is relatively simple to assign phonemes to their appropriate positions. For example, Joanisse and Seidenberg used the CCVVCCC template to represent English monosyllables, in which a consonant initial would occur in the first C position and consonant clusters would occupy the first two CC positions; single vowels occur at the first V position, but diphthongs occupy both VVs, and so on. Plunkett and Juola used a CCCVVCCC template, which could additionally accommodate consonant clusters such as /str/ at the word-initial position. Thus, in this type of coding, spot could occur as spCoVtCC in the template, whereas pot could occur as CpCoVtCC, thus preserving their phonological similarities.

The representations used by Joanisse and Seidenberg (1999) and by Plunkett and Juola (1999) are restricted to monosyllables. MacWhinney and Leinbach (1991) also used slots to represent multisyllabic English verbs. For example, a full trisyllabic template in MacWhinney and Leinbach's representation had a CCCVVCCCVVCCCVVCCC form. Recently, Bullinaria (1997) presented a model that combined the slot-based representation with aspects of the single-segment processing used in NetTalk. However, it appears (...truncated)