Magnitude of reward in selective learning

JOHN H. MORRISON AND JOHN J. PORTER
UNIVERSITY OF WISCONSIN-MILWAUKEE

Abstract

The growth of incentive motivation was investigated in a two-choice discrimination using three groups of 10 Ss for 300 trials. The experimental group was trained with 20 mg and 97 mg rewards. One control group received 20 mg while the other control group received 97 mg on all trials. Asymptotic running speed was not achieved, but the converging speeds of the control groups tended to support the "habit" hypothesis. Both a negative contrast effect, where the experimental Ss ran slower to 20 mg than did the controls, and a positive contrast effect, where the experimental Ss ran faster to 97 mg than did the controls, were observed.

Problem

For Hull (1952) and Spence (1956), incentive motivation (K) has been assumed to be a function of the classically conditioned consummatory habit (rg). It follows that a large reward leads to more rapid growth of rg (and thus K) than a small reward, since, on each trial, the consummatory response is practiced more often when ingesting a large reward. However, when the rg habit to the small reward reaches asymptote, performance should be the same to both small and large reward. Thus, the rate of performance increase depends on reward magnitude, while the asymptote of performance does not (Spence, 1956, pp. 140-141).

Both Hull and Spence considered the alternative that different rewards might produce different rgs, rather than different rates of learning the same rg. Spence (1956, pp. 142-148) assumed that the "vigor" of the consummatory response, rather than the amount of practice, determined incentive motivation (K). Thus, the "vigor" hypothesis, in contrast to the "habit" hypothesis, predicts that the asymptote of performance is a function of the magnitude of reward.
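As a rough illustration of how the two hypotheses differ, the following Python sketch contrasts them under assumed growth curves. The exponential form, the function names (incentive, habit_k, vigor_k), and every parameter value are hypothetical choices made for illustration; none of them comes from Hull, Spence, or the present data.

```python
import math

# Reward magnitudes used in the study (mg).
SMALL_MG, LARGE_MG = 20, 97


def incentive(trials, asymptote, rate):
    """Negatively accelerated growth toward an asymptote (assumed form)."""
    return asymptote * (1.0 - math.exp(-rate * trials))


def habit_k(trials, reward_mg, asymptote=1.0, rate_per_mg=0.0005):
    # "Habit" hypothesis: reward magnitude changes only the growth rate.
    # asymptote and rate_per_mg are hypothetical values.
    return incentive(trials, asymptote, rate_per_mg * reward_mg)


def vigor_k(trials, reward_mg, asymptote_per_mg=0.01, rate=0.02):
    # "Vigor" hypothesis: reward magnitude changes the asymptote itself.
    # asymptote_per_mg and rate are hypothetical values.
    return incentive(trials, asymptote_per_mg * reward_mg, rate)


if __name__ == "__main__":
    print("trials  habit(20)  habit(97)  vigor(20)  vigor(97)")
    for n in (60, 120, 180, 240, 300, 3000):
        values = [
            habit_k(n, SMALL_MG),
            habit_k(n, LARGE_MG),
            vigor_k(n, SMALL_MG),
            vigor_k(n, LARGE_MG),
        ]
        print(n, *(round(v, 3) for v in values))
```

Under these assumptions the "habit" curves for 20 mg and 97 mg separate early and converge with extended training, while the "vigor" curves remain apart at any number of trials.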
Most studies of magnitude of reward seem to favor the "vigor" hypothesis over the "habit" hypothesis (cf. Festinger, 1943; Davenport, 1964). However, the "habit" hypothesis has not really been tested, since none of the relevant studies have used enough trials to assure asymptotic performance.

Selective learning studies contrasting differential reward magnitudes have suggested that running speed is not a simple function of magnitude of reward, but is rather a function of the difference between two rewards. This "contrast effect" has been observed by Bower (1961), Porter (1964), Clayton (1965), and Spear & Hill (1965).

The present study attempted to answer two questions about contrast effects and the "vigor" and "habit" hypotheses in selective learning: With other factors held constant, does magnitude of reward determine response speed and choice behavior, or are reward effects temporary, with amount of practice of the consummatory response determining final asymptotic performance? Are contrast effects temporary or permanent obstructions which mask the simple effects of reinforcement?

Psychon. Sci., 1965, Vol. 3

Method

Thirty naive female hooded rats, 180 days old when training began, were used. Ss were run in a wedge-shaped choice chamber, similar to that used by Ramond (1954), which was 12 in. deep and 18 in. wide at the goal end. The chamber had retractable response levers on either side, each illuminated by a shielded 7-watt bulb mounted above it. On free choice trials both lights were on and both levers were in the chamber. On forced trials, only one lever was in the chamber and one light was on. Ss were timed from the opening of the start box glass door until S actuated a contact relay by touching the bar.

The Ss were assigned to three groups of 10 each, one experimental group and two control groups. The experimental group received a 97 mg Noyes pellet to the large reward side and a 20 mg pellet to the small reward side.
Both control groups received equal reward on both sides of the chamber, the small reward group 20 mg and the large reward group 97 mg. One half of the Ss in the 97-20 group were assigned their large reward to their preferred side to balance position preferences.

For five weeks before the experiment began, Ss were allowed 1 hr. free access to food after 23 hr. deprivation. During this period, each S was handled for 5 min. daily. For three days prior to the actual training, Ss were allowed 5 min. exploration of the apparatus. All Ss received four such exposures daily. On half of the exploratory trials a pellet was placed on the left side of the chamber, the remainder of the time on the right. The experimental Ss received 97 mg on half of these trials and 20 mg on the remaining trials. Although the levers were not in the chamber during this phase of training, both 7-watt bulbs were on.

Following pretraining, Ss received six trials a day for four days, 12 trials a day for two days, and 18 trials a day thereafter. The Ss were run one group at a time. After a block of six trials Ss were returned to their home cages. In each block of six trials the sequence of choice and forced trials was CCFFFF. Response speeds were measured on the last two forced trials in each block of six. On the 18 trials per day schedule Ss were forced twice to the large reward on the fifth trial of one day and forced twice to the large reward on the sixth trial the next day.

Results

The mean speeds of the control groups and of the experimental group to both 97 mg and 20 mg appear in Fig. 1, plotted in 5 blocks of 6 trials to each discriminandum. The points on the curves represent mean speeds after 60, 120, 180, 240, and 300 trials. Comparing the speeds of the experimental Ss to 20 mg with the speeds of the two control groups demonstrated that performance depended upon both magnitude of reward, F(2,27) = 4.69, p < .05, and number of trials, F(4,108) = 74.95, p < .001.
A significant interaction between these variables, F(8,108) = 6.95, p < .001, demonstrated the changing relationship between the speeds of the experimental and control groups over trials, and prompted comparisons of the mean speeds between groups at different stages of training.

Fig. 1. Mean speeds after each block of 60 trials for the two control groups (C) and the experimental group (E). Each point on the figure represents the mean of 20 forced trials (10 per side) for the control groups, and 10 forced trials to the 97 mg side and 10 forced trials to the 20 mg side for the experimental groups.

The critical difference (CD) for all comparisons (Cochran & Cox, 1950) was calculated: CD(26,107) = 0.195, p = .05. After 180 trials the mean speed of the experimental Ss to 20 mg was significantly less than the speed of the 20-20 Ss, p = 0.271. After 2 (...truncated)