Reward magnitude shifts: A savings effect

https://link.springer.com/content/pdf/10.3758/BF03342150.pdf

ALLAN R. WAGNER AND EARL THOMAS
YALE UNIVERSITY

Two groups of rats were first trained to traverse a runway, one with large and the other with small food rewards. Both groups were then given additional training with small rewards until their performance equalized. In final training, both groups were shifted to the larger rewards. The group with a prior history of large rewards evidenced savings in the final stage of training, in terms of a faster rate of performance change to the level appropriate to the larger reward.

Under certain conditions the magnitude of reward variable can be shown to have relatively persistent effects upon the runway performance of the rat. Thus, Ss trained with large, as compared to small, rewards when subjected to experimental extinction will continue to evidence their different reward histories over the course of extinction (e.g., Wagner, 1961; Hulse, 1958). Under other conditions, however, reward magnitude appears to have a less persistent influence. Thus, Ss trained with large rewards and then shifted to smaller rewards will, after a relatively small number of postshift trials, come to respond at speeds indistinguishable from those of Ss trained only with small rewards (e.g., Crespi, 1944; Zeaman, 1949). A question which this poses is whether shifting to a new, but nonzero, value of reward is an especially effective method of eliminating any influences of prior reward magnitudes, or whether relatively persistent historical influences exist but are simply not observed in performance under these conditions. The results of the present investigation of shifts in reward magnitude support the latter interpretation.

Method

The Ss were 40 male albino rats, 90 to 110 days of age at the beginning of the experiment. They were randomly assigned to two groups of 20 Ss each.
The apparatus was a straight, enclosed alley 3 in wide and 4 in high throughout, divided by guillotine doors into a 12-in start box, a 45-in runway, and a 15-in goal box. A 1-in deep food cup was attached to the end wall of the goal box with its rim 1.5 in from the floor. Total-Response time was measured from the opening of the start box door until the interruption of a photocell beam just prior to the rim of the food cup, and was fractionated by means of intermediate photocells located 12 in and 48 in from the start box, into three successive segments: start-time, running-time, and goal box-time.

Ten days prior to experimental training, Ss were placed on a 24-hour food deprivation schedule and brought to between 75% and 80% of their ad lib, 100-day weight. The Ss were maintained within this range throughout the experiment by limited feedings 30 min after daily training. During the course of these 10 days, Ss were habituated to the experimental situation by systematic handling, exposure to wet mash pellets scattered on a raised platform, exposure to the runway in groups of three, and finally, two direct placements in the goal box of the alley with a wet mash pellet (.5 gm dry weight) in the food cup.

Psychon. Sci., 1966, Vol. 4

Experimental training for all Ss consisted of one alley trial per day for 88 days. A trial was begun by placing S in the start box. When S oriented to the start door, the door was raised, allowing S to traverse the alley. On each trial a wet mash pellet, pressed from a mixture of 2 parts (by weight) Purina Lab Chow powder and 1 part water, was available in the food cup. The S was removed from the goal box immediately after eating and returned to its home cage.

The treatment of the two groups differed only in the amount of reward received during the first of three stages of training. Group LSL received 1.0 gm (dry weight) rewards during the first 43 trials (Stage I). Group SSL received .1 gm rewards during the same period.
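As a rough illustration of the design, the reward sequence for the two groups can be written as a small lookup. This is only a sketch under stated assumptions: the stage lengths (43, 16, and 29 daily trials, 88 in all) and the reward sizes are taken from the Method, while the names STAGE_LENGTHS, REWARD_GM, stage_of, and reward_gm are hypothetical and do not come from the original report. The value returned is the reward delivered at the end of a given trial, so any effect on running speed could appear only from the following trial onward.

```python
# Stage lengths in daily trials, as stated in the Method (43 + 16 + 29 = 88).
# All names below are hypothetical; they are not from the original report.
STAGE_LENGTHS = (("I", 43), ("II", 16), ("III", 29))

# Reward size in grams (dry weight) delivered in each stage, per group.
REWARD_GM = {
    "LSL": {"I": 1.0, "II": 0.1, "III": 1.0},  # large, small, large
    "SSL": {"I": 0.1, "II": 0.1, "III": 1.0},  # small, small, large
}


def stage_of(trial):
    """Return the stage label ("I", "II", or "III") for a 1-based trial number."""
    if trial < 1:
        raise ValueError("trial numbers start at 1")
    last_trial_of_stage = 0
    for label, length in STAGE_LENGTHS:
        last_trial_of_stage += length
        if trial <= last_trial_of_stage:
            return label
    raise ValueError("trial number exceeds the 88 trials of the experiment")


def reward_gm(group, trial):
    """Return the reward (gm, dry weight) delivered at the end of a trial."""
    return REWARD_GM[group][stage_of(trial)]


if __name__ == "__main__":
    # Print the reward each group receives at the stage boundaries.
    for trial in (1, 43, 44, 59, 60, 88):
        print(trial, stage_of(trial), reward_gm("LSL", trial), reward_gm("SSL", trial))
```

Under this indexing the two groups differ only on trials 1 through 43, which is the single manipulated difference the Method describes.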
During the next 16 days (Stage II) both groups received .1 gm reward, and during the last 29 days (Stage III), 1.0 gm reward. Thus in the three stages, Group LSL received first large, then small, and finally large reward again, whereas Group SSL received small reward during the first two stages and first encountered the larger reward during Stage III.

Results and Discussion

Figure 1 presents mean Total-Response speeds for the two groups during the three stages of training. During Stage I, when Group LSL received 1.0 gm reward as compared to .1 gm for Group SSL, the former group consistently ran faster than the latter. The difference in mean Total-Response speeds on the last four trials in Stage I was highly reliable, t(38) = 3.40, p < .001. The response speed of the two groups equalized after 4 trials of Stage II, when both received the same small reward. The speeds remained similar until after the beginning of Stage III, when 1.0 gm rewards were resumed for Group LSL and introduced for the first time for Group SSL.

Fig. 1. Mean Total-Response speeds for groups of 20 Ss receiving a sequence of either large, small, large (LSL) or small, small, large (SSL) rewards in three successive stages of runway training. Changed reward size when appropriate according to these sequences was introduced at the termination of trials 44 and 60, after the completion of the response measure reported.

Although both groups received the same magnitude of reward during this stage, the initiation of Stage III was quickly followed by a reseparation of the performance of the two groups. On the first block of 4 Stage III trials following the reintroduction of large reward (i.e., trials 61-64) Group LSL attained a mean speed of running which was maintained with but small fluctuations over the remainder of Stage III.
In comparison, Group SSL showed a more gradual increase in running speed before reaching the same level as Group LSL. A comparison of the increase in running speed on trials 61-64 over the preceding block of 4 trials prior to the receipt of the larger reward yielded a significantly greater change for Group LSL than for Group SSL, t(38) = 2.43, p < .02.

While the Total-Response measure presents a clear picture of savings, with Group LSL returning faster to a level of performance appropriate to the larger rewards than Group SSL, this effect was not universally observed in the several component measures. Starting speeds revealed no differences between the groups in Stage III, and goal box speeds showed the largest separation of the two groups. That is, the effect became more pronounced as Ss' behavior was measured nearer to the goal area.

According to Hull-Spence theory (e.g., Spence, 1956) different magnitudes of reward may be presumed to lead to the conditioning of anticipatory reward responses (r_g-s_g) of different vigors. When Ss are shifted from large to small rewards it may be assumed that a new and less vigor (...truncated)