Journal ID (publisher-id): jgi
Publisher: Centre for Addiction and Mental Health
© 1999-2008 The Centre for Addiction and Mental Health
Received: 9 April 2007
Accepted: 24 August 2007
Publication date: June 2008
First Page: 56 Last Page: 67
Publisher Id: jgi.2008.21.6
Random-ratio schedules of reinforcement: The role of early wins and unreinforced trials
School of Psychology, University of Western Sydney, New South Wales, Australia. E-mail: firstname.lastname@example.org
For correspondence: John Haw, School of Psychology, University of Western Sydney, Locked Bag 1797, South Penrith Distribution Centre, NSW 1797, Australia. E-mail: email@example.com
All URLs were active at the time of submission. This article was peer-reviewed.
Competing interests: None declared.
Ethics approval: Not required.
John Haw (PhD) is a lecturer of psychological research methods in the School of Psychology, University of Western Sydney, Australia. He completed his PhD thesis, “An operant analysis of gaming machine play”, in 2000 and has conducted further research on the psychological predictors of problem gambling. He has also consulted with industry and government on various gambling issues and has supervised PhD students in the area of cognitive/behavioural explanations of gambling behaviours.
The distribution of rewards in both variable-ratio and random-ratio schedules is examined with specific reference to gambling behaviour. In particular, it is the number of early wins and unreinforced trials that is suggested to be of importance in these schedules, rather than the often-reported average frequency of wins. Gaming machine data are provided to demonstrate the importance of early wins and unreinforced trials. Additionally, the implication of these distributional properties for betting strategies and the gambler's fallacy is discussed. Finally, the role of early wins and unreinforced trials is considered for gambling research that utilises simulated gaming machines and research that compares concurrent schedules of reinforcement.
Turner and Horbay (2004) provided a comprehensive review of the underlying mechanisms governing electronic gaming machine (EGM) play. Their review addressed many of the misconceptions about the design of gaming machines, and although it was intended for counsellors, prevention workers in the field of problem gambling, and the general public, it is also of use to those studying gambling behaviours in experimental settings with simulated slot machines (e.g., Dixon, MacLin, & Daugherty, 2006; Weatherly & Brandt, 2004; Weatherly, Sauter, & King, 2004; Zlomke & Dixon, 2006).
The current paper extends some of the issues raised by Turner and Horbay (2004) with specific reference to random-ratio (RR) schedules and the role of early wins and unreinforced trials. First, the difference between variable-ratio (VR) and RR schedules of reinforcement is discussed in terms of the number of early wins and the number of unreinforced trials that each schedule provides. It is argued that these properties of gaming machine reinforcement have implications for the gambler's fallacy, schedule-induced behaviours, and research using simulated gaming machines.
Second, the notion of early wins and unreinforced trials in RR schedules receives greater scrutiny in this paper. Just as Turner and Horbay (2004) examined the misconceptions among gamblers regarding the law of averages and win/loss expectations, this paper examines the misconception among researchers regarding average reinforcement rate and experimental control. Gaming machine data are provided to illustrate the importance of the distribution of early wins and unreinforced trials.
A number of early gambling researchers referred to gaming machines as operating under a variable ratio of reinforcement (Cornish, 1978), and, even today, the slot machine is typically provided as an example of a VR schedule to undergraduate psychology students (e.g., Weiten, 2007). It has since been documented that gaming machines operate under a more complex RR schedule of reinforcement (Crossman, 1983; Hurlburt, Knapp, & Knowles, 1980; Turner & Horbay, 2004), utilising pseudo-random number generators; Turner and Horbay (2004) debunked many of the myths associated with randomness in slot machine play. However, the difference between a VR and an RR schedule of reinforcement has not been illuminated previously with reference to gambling behaviour.
A variable ratio of 2.5 indicates that, on average, one response in every 2.5 will be rewarded. A VR schedule of this type is designed around a predetermined set of run lengths (the number of responses required to earn each reinforcer), for example, 1, 2, 3, and 4, arranged in a variable order to form the VR sequence. The VR schedule thus comprises a number of different sized fixed-ratio schedules (Crossman, 1983). Behaviourally, this means that the maximum number of responses required for reinforcement will be four and the minimum one. If the run lengths are drawn repeatedly and at random, the result is an indefinitely long sequence of digits (not digits of an indefinite size), which serve as the run lengths on a VR schedule with an average run length of approximately 2.5. With enough trials, around one quarter of all runs should be of length 1, one quarter of length 2, one quarter of length 3, and one quarter of length 4.
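The construction described above can be sketched as a short simulation. This is only an illustration, assuming that each run length in the set {1, 2, 3, 4} is drawn independently and with equal probability; the function name `vr_run_lengths` and the seed are hypothetical and do not come from any cited study.

```python
import random
from collections import Counter

def vr_run_lengths(n_runs, lengths=(1, 2, 3, 4), seed=0):
    """Draw run lengths for a VR schedule built from a fixed set of ratios.

    Assumption: every run length in `lengths` is equally likely on each draw.
    """
    rng = random.Random(seed)
    return [rng.choice(lengths) for _ in range(n_runs)]

runs = vr_run_lengths(100_000)
mean_run = sum(runs) / len(runs)
# Share of runs at each length; each should be close to one quarter.
shares = {k: v / len(runs) for k, v in sorted(Counter(runs).items())}
print(round(mean_run, 2), shares)
```

However many runs are drawn, no run is ever longer than four responses, which is the property that separates this schedule from a random ratio.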
With a random ratio of 2.5, the sequence will contain run lengths with a mean of 2.5, but the run lengths themselves can range from 1 to an indefinitely large number. Thus, whilst both types can be described by an average sequence of run lengths, the distribution of run lengths for these two will be greatly different. In gaming machine play, this difference has implications for both cognitive (the gambler's fallacy) and learning (schedules of reinforcement) explanations of persistent gaming behaviour.
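As a worked illustration, and assuming the simplest form of RR schedule in which every response has the same independent probability p of being rewarded, the run length N (the number of responses up to and including the win) follows a geometric distribution. The symbols N, k, and p are introduced here for illustration only.

```latex
\begin{align*}
P(N = k) &= (1 - p)^{k-1}\, p, \qquad k = 1, 2, 3, \dots \\
E[N] &= \sum_{k=1}^{\infty} k\,(1 - p)^{k-1}\, p = \frac{1}{p} \\
\text{With } p = 0.4:\quad E[N] &= 2.5, \qquad P(N = 1) = 0.4, \qquad P(N > 4) = 0.6^{4} \approx 0.13
\end{align*}
```

Under this assumption, roughly 13% of runs would exceed four responses, an outcome that cannot occur on the variable ratio of 2.5 described earlier.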
Under the VR schedule outlined above, the probability of a reinforcer on the next response increases with every unrewarded response (Crossman, 1983). That is, the first response has a 0.25 chance of being rewarded, and if no reward is provided, then the next response has a 0.33 chance of being rewarded, the next has a 0.50 chance, and the last has a 1.00 chance. Thus, the maximum number of unreinforced responses is three, and if this sequence occurs then there is a 100% probability that the next response will be rewarded. This is because a VR schedule is designed with a predetermined number of reinforced response lengths: in this example, they are 1, 2, 3, and 4. With adequate exposure to these conditions the gaming machine player could rationally expect a win after a loss and develop a reasonable strategy of increasing the stake size to increase the impending reward. The development of this type of strategy is considered the basis for the principle of the gambler's fallacy (Ayton & Fischer, 2004; Ladouceur, 2004) when applied to RR schedules; however, the probabilities indicate that it is not a fallacy under a VR schedule.
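The conditional probabilities quoted above can be reproduced with a few lines of arithmetic. The sketch assumes the four run lengths are equally likely; `vr_hazard` is a hypothetical helper name.

```python
from fractions import Fraction

def vr_hazard(lengths=(1, 2, 3, 4)):
    """P(win on response k | the first k - 1 responses were unrewarded).

    Assumption: each run length in `lengths` is equally likely.
    """
    hazards = {}
    for k in range(1, max(lengths) + 1):
        still_possible = [n for n in lengths if n >= k]  # runs not yet ruled out
        ending_now = [n for n in still_possible if n == k]
        hazards[k] = Fraction(len(ending_now), len(still_possible))
    return hazards

hazard = vr_hazard()
for k, h in hazard.items():
    print(k, h, round(float(h), 2))
```

The values rise from 1/4 to 1/3 to 1/2 to 1, matching the 0.25, 0.33, 0.50, and 1.00 given in the text.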
Furthermore, after a response has been rewarded, the probability of the next response (recommencement of play) being rewarded is 0.25. Therefore, the probability of it not being rewarded is 0.75. Thus, if the experience of play has been that after a win another win occurs only 25% of the time, or that no win occurs 75% of the time, the behaviour of the player is likely to reflect this. The player may adjust the size of their bet based on the probability of a win or loss.
Under an RR schedule, each response-outcome is independent of the previous one because there is a constant probability of payoff for each trial (Crossman, 1983; Hurlburt et al., 1980). All EGMs operate under an RR schedule, and the size of this probability is determined in a more complex manner by a random number generator (see Turner & Horbay, 2004, for a more detailed explanation of the modern EGM configuration). EGMs are also very volatile, and the response-outcome relationship is influenced by secondary machine characteristics such as the multiplier potential, the pay structure, “free” games, near misses, and linked jackpots (Griffiths, 1993). These can all promote irrational beliefs about winning. Under an RR schedule, the expectation that a win is due after a run of losses is a genuine fallacy, because the probability of a win does not change with the number of preceding losses, unlike under a VR schedule.
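The independence of outcomes under an RR schedule can be checked in a small simulation. This is a sketch under a strong simplifying assumption: a single constant win probability of 0.4 per trial (a random ratio of 2.5), with none of the secondary machine characteristics listed above. The function names and the seed are hypothetical.

```python
import random

def rr_outcomes(n_trials, p_win=0.4, seed=1):
    """Simulate independent trials with a constant win probability (True = win)."""
    rng = random.Random(seed)
    return [rng.random() < p_win for _ in range(n_trials)]

def win_rate_after_losses(outcomes, k):
    """Share of wins among trials immediately preceded by at least k losses."""
    wins = total = streak = 0
    for won in outcomes:
        if streak >= k:
            total += 1
            wins += won
        streak = 0 if won else streak + 1
    return wins / total

outcomes = rr_outcomes(200_000)
rates = {k: round(win_rate_after_losses(outcomes, k), 3) for k in range(5)}
print(rates)
```

Each rate should sit close to 0.4 regardless of how many losses came before, so raising the stake after a losing streak gains nothing under this model.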
It is worth noting that some studies have assessed the rate of responding and postreinforcement pauses on EGMs in relation to wins and losses (Delfabbro & Winefield, 1999; Dickerson, Hinchy, Legg England, Fabre, & Cunningham, 1992; Schreiber & Dixon, 2001) and have generally found a pattern of play on slot machines that is very similar to that found on VR schedules.
The only published study comparing human gaming behaviour under both a VR and an RR schedule is Hurlburt et al. (1980). Their study involved 20 undergraduate students playing a computer-simulated game in a laboratory setting and gambling bogus money. Their dependent variables were schedule preference, measured by the number of bets made, and strategy employment, measured by the amount staked per gamble (with increasing stake size indicating the player believed a win was imminent). The aim of the Hurlburt et al. study was to determine if participants preferred a VR schedule to an RR schedule and whether participants employed a betting strategy on a VR schedule but not an RR schedule. The results suggested no behavioural differences between the schedules, although the support for the null hypothesis may be explained by poor ecological validity and statistical power problems. The study utilised an unrealistic teletype simulation for the slot machine (Dixon et al., 2006) with a small number of trials, and the power of the statistical test chosen was adequate to detect very large effect sizes only.
Hurlburt et al. (1980) noted other explanations for the support of the null hypothesis. They suggested that the manner in which the participants were introduced to the schedules might have played a critical role, as “[s]haping is apparently more likely than verbal instructions to lead to differential responding” (p. 638). Thus, the behavioural significance of the distributional difference between the variable ratio and the random ratio may become more apparent over a greater number of trials, as learning of the distributional properties of the VR schedule may take some time. Other work on schedules has also suggested that exposure levels may explain sensitivity to schedules (Weatherly & Brandt, 2004).
There is still uncertainty regarding the behavioural differences between a VR and an RR schedule. Empirically, this could have an impact on the use of computer-simulated gaming devices based on VR schedules or where the schedule is unknown. However, a computerised slot machine has been devised by MacLin, Dixon, & Hayes (1999) which operates under an RR schedule (Zlomke & Dixon, 2006) and allows researchers to manipulate a number of key variables. Several published studies have since utilised this freely available software (Dixon et al., 2006; Schreiber & Dixon, 2001; Weatherly et al., 2004; Weatherly & Brandt, 2004; Zlomke & Dixon, 2006) to test cognitive and learning explanations of gambling behaviour. However, researchers using actual slots or computer-simulated versions need to be aware of the distributional properties of RR schedules in order to ensure control across participants and machines. In particular, it is argued below that the important consideration is, again, the distribution of early wins and unreinforced trials.
Another problem with the Hurlburt et al. (1980) study was that the difference in the distribution of reinforcement between the VR and RR schedules was not illustrated. Under a VR schedule, with sufficient trials, the distribution of reinforcement should be graphically represented as a flat, horizontal line. This reflects the fact that the frequency of wins occurring after one response is the same as the frequency of wins occurring after two, three, or four responses. However, under an RR schedule, the distribution of reinforcement is very different. With a random ratio of 2.5 a win may occur only after 100 responses (which is impossible under a variable ratio of 2.5), and such a long run pulls the average run length upward (the effect an outlier has on the mean). Therefore, under an RR schedule, the majority of reinforcers must occur after fewer responses than the mean, which compensates for the effect of any outlier and keeps the mean at 2.5.
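The contrast between the two distributions can be tabulated directly. The sketch below assumes equally likely run lengths of 1 to 4 for the VR schedule and a constant win probability of 0.4 for the RR schedule; `vr_pmf` and `rr_pmf` are hypothetical helper names.

```python
def vr_pmf(k, lengths=(1, 2, 3, 4)):
    """Share of VR runs of length k, assuming equally likely run lengths."""
    return lengths.count(k) / len(lengths)

def rr_pmf(k, p_win=0.4):
    """Share of RR runs of length k, assuming a constant win probability."""
    return (1 - p_win) ** (k - 1) * p_win

# Side-by-side shares for run lengths 1 to 8.
for k in range(1, 9):
    print(k, vr_pmf(k), round(rr_pmf(k), 3))

# Share of runs that end before the mean run length of 2.5 is reached.
vr_below_mean = sum(vr_pmf(k) for k in (1, 2))
rr_below_mean = sum(rr_pmf(k) for k in (1, 2))
print(vr_below_mean, round(rr_below_mean, 2))
```

Under these assumptions, half of the VR runs but nearly two thirds of the RR runs end within two responses, which is what allows the occasional long RR run without shifting the mean.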
This is shown in the figures below. Figure 1 shows the distribution of wins under a VR schedule (variable ratio 2.5) and Figure 2 displays the results of 856 bets placed by the author on a real slot machine in a gaming venue, providing an RR schedule (random ratio 2.56).
Figures 1 and 2 illustrate the difference in reinforcer rates between a VR schedule and an RR schedule. Both have a similar mean reinforcement ratio, but the distribution of reinforcers is considerably different. It is also evident that the RR schedule possesses a modal run length (a win on the first response) that is shorter than the mean run length. Figure 2 shows that over 35% of first button presses are reinforced, compared to only 25% for the VR schedule. Also, the number of consecutive unreinforced trials is vastly different between the two distributions. Just how this difference is reflected in gambling behaviour is unclear, but it is possible that regular players become sensitive to the number of early wins and/or the number of unreinforced trials and operate according to these values. Certainly, both of these would appear easier to detect in gaming machine play than the average reinforcement rate.
If players are aware of these characteristics, then it is these characteristics that must be reported when testing the effect of schedules on playing behaviour. The study by Hurlburt et al. (1980) controlled for the average rate of reinforcement and reported that participants failed to discriminate. This may be because the numbers of early wins and unreinforced trials were similar in their examples. Similarly, other studies comparing schedules also only report the average (Dixon et al., 2006; Schreiber & Dixon, 2001) and assume control has been achieved if the two distributions have the same average. Just as Turner and Horbay (2004) illustrated how the average reinforcement figure can mislead gamblers about the nature of the reinforcement distribution, the average reinforcement figure can also mislead the researcher into believing that control has been achieved.
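One way a researcher might report these characteristics alongside the average is sketched below. The definitions are assumptions made for illustration: an early win is taken to be a win on the first response after the previous win, and the outcome sequence `demo` is invented, not data from any study.

```python
def run_lengths(outcomes):
    """Number of responses up to and including each win (True = win)."""
    runs, count = [], 0
    for won in outcomes:
        count += 1
        if won:
            runs.append(count)
            count = 0
    return runs  # a trailing, unfinished run of losses is ignored

def schedule_summary(outcomes):
    """Mean run length plus the two distributional properties discussed above."""
    runs = run_lengths(outcomes)
    return {
        "mean_run": sum(runs) / len(runs),
        "early_win_share": runs.count(1) / len(runs),
        "max_unreinforced": max(runs) - 1,  # longest losing streak before a win
    }

# Hypothetical outcome sequence for illustration only.
demo = [True, False, False, True, True, False, True,
        False, False, False, True]
summary = schedule_summary(demo)
print(summary)
```

Two machines or simulated schedules could then be compared on all three values rather than on the mean alone.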
When testing concurrent schedules, it may be that the number of early wins and unreinforced trials needs to be similar to properly ensure control across the schedules. Zlomke and Dixon (2006) provided an excellent example of the experimental rigour needed when testing concurrent schedules. Using the simulated game from MacLin et al. (1999), they compared machines that differed only in colour by controlling for possible variations in reinforcement density. This resulted in an identical sequence of trial outcomes, thereby ensuring control.
With different machines having different RR distributions in gaming venues, these characteristics may affect machine selection and machine persistence. It is possible that players have a preference for schedules based on the number of early wins and the number of unreinforced trials. There is some support that small frequent wins are preferred by players (Dixon et al., 2006; Griffiths, 1999) and also support for the general concept that the placement of wins in a gambling cycle can influence gambling behaviour such as persistence (Weatherly et al., 2004).
Delfabbro and Winefield (2000) and Walker (1992) linked persistent gaming machine play to irrational thoughts generated by beliefs about gaming machine reinforcement schedules. Sharpe (2002) extended this point and cited Vitaro, Arseneault, and Tremblay's (1999) finding that impulsive individuals tend to prefer immediate reinforcement. She concluded that the placement of wins early in the gaming experience (i.e., a big win when first gambling) and the patterns of wins and losses within gaming sessions “may have etiological significance in the development of problematic levels of gambling in vulnerable individuals” (p. 8). She developed a comprehensive model of problem gambling that included win/loss patterns and cognitive biases.
Another way in which RR schedules bear on gaming machine play is through bet size. On North American and Australian slot machines, the player chooses how many lines to play and pays for each line; in exchange for the larger stake, purchasing more lines increases the frequency of reinforcement. Figures 3 and 4 show the distribution of reinforcers when playing 10 lines and when playing 20 lines on the same slot machine. The same number of bets was placed on each (n = 428).
Figures 3 and 4 clearly show that increasing the number of pay lines from 10 to 20 increases the mean reinforcer rate from 1 in 3.00 to 1 in 2.20. However, of greater interest is the fact that the longest run of unreinforced trials was longer when playing 10 lines (maximum = 12) than when playing 20 lines (maximum = 7). The most frequently occurring run length was one under both conditions (i.e., a win on the first response after the previous win), but the percentage of runs rewarded after one response was higher when playing 20 lines (45%) than when playing 10 lines (28%). Perhaps it is this increase in the number of early wins, and the decrease in the length of runs of unreinforced trials, that influences player betting strategies and the decision to continue gambling. By purchasing more lines to play on a slot machine, a player can increase the frequency of reinforcement and reduce the number of unreinforced trials. This could promote the player's belief that they can control the betting outcomes (e.g., “If I buy more lines I get more wins and fewer losses”), which is true regarding the frequency of (small) wins, but actually leads to an increase in the rate of net loss. Empirical investigation of this is needed with regard to the illusion of control and possible chasing behaviour due to increased rates of losses. It is worth noting that increasing the number of lines played increases the amount staked and that a machine's maximum stake limit has been shown to be a characteristic that influences time and money spent gambling, along with other behaviours such as cigarette and alcohol consumption (Sharpe, Walker, Coughlan, Enersen, & Blaszczynski, 2005). Hence, early wins and unreinforced trials are perhaps the components of the RR schedule that need to be manipulated and reported in studies of the effect of schedules of reinforcement on gaming machine behaviour.
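For comparison with the observed figures, the following sketch shows what a simple constant-probability model would predict at the two reported mean ratios. It assumes each bet wins independently with probability 1 divided by the mean ratio; real machines are more complex, so differences from the observed 28% and 45% may reflect sampling variation or machine features the model omits. The function names are hypothetical.

```python
def first_response_share(mean_ratio):
    """Expected share of runs won on the first response (constant-p model)."""
    return 1 / mean_ratio

def prob_losing_run_at_least(mean_ratio, k):
    """P(at least k unreinforced trials before the next win), constant-p model."""
    p_win = 1 / mean_ratio
    return (1 - p_win) ** k

# Mean ratios taken from the text: 1 in 3.00 (10 lines), 1 in 2.20 (20 lines).
for label, ratio in (("10 lines", 3.0), ("20 lines", 2.2)):
    print(label,
          round(first_response_share(ratio), 2),
          round(prob_losing_run_at_least(ratio, 7), 3))
```

The model reproduces the direction of both observations: more lines give a larger share of first-response wins and make long losing runs less likely.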
The current paper provides an important extension to Turner and Horbay's (2004) review of EGM design. This extension is of most benefit to gaming machine researchers because there is a need for awareness of the differences between RR and VR schedules. This has methodological implications for research and is important for the appropriate evaluation of research in this field. In particular, gaming machine researchers should be aware of the difference in the distribution of reinforcement between the two types of schedules. This will have an impact on the use of simulated gaming devices in research and the generalisation of behaviour under a VR schedule to RR schedules. Moreover, there is a need for research to report the frequency of early wins and the length of unreinforced trials in the RR distribution, rather than assume that two distributions are identical based on the average reinforcement rate. To date, the theoretical and behavioural significance of early wins and unreinforced trials has not been examined within the gaming machine context; however, there does appear to be some relationship with the gambler's fallacy, the illusion of control, and the role that reinforcement has on persistent gaming behaviour.
Ayton, P., & Fischer, I. (2004). The hot hand fallacy and the gambler's fallacy: Two faces of subjective randomness. Memory and Cognition, 32, 1369–1378.
Cornish, D.B. (1978). Gambling: A review of the literature and its implications for policy and research. London: Her Majesty's Stationery Office.
Crossman, E. (1983). Las Vegas knows better. The Behavior Analyst, 6, 109–110.
Delfabbro, P.H., & Winefield, A.H. (1999). Poker machine gambling: An analysis of within session characteristics. British Journal of Psychology, 90, 425–439.
Delfabbro, P.H., & Winefield, A.H. (2000). Predictors of irrational thinking in regular slot machine gamblers. The Journal of Psychology, 134, 117–128.
Dickerson, M.G., Hinchy, J., Legg England, S., Fabre, J., & Cunningham, R. (1992). On the determinants of persistent gambling: 1. High frequency poker machine players. British Journal of Psychology, 83, 237–248.
Dixon, M.R., MacLin, O.H., & Daugherty, D. (2006). An evaluation of response allocations to concurrently available slot machine simulations. Behavior Research Methods, 38, 232–236.
Griffiths, M.D. (1993). Fruit machine gambling: The importance of structural characteristics. Journal of Gambling Studies, 9, 101–120.
Griffiths, M.D. (1999). Gambling technologies: Prospects for problem gambling. Journal of Gambling Studies, 15, 265–283.
Hurlburt, R.T., Knapp, T.J., & Knowles, S.H. (1980). Simulated slot-machine play with concurrent variable and random-ratio of schedules of reinforcement. Psychological Reports, 47, 635–639.
Ladouceur, R. (2004). Perceptions among pathological and nonpathological gamblers. Addictive Behaviors, 29, 555–565.
MacLin, O.H., Dixon, M.R., & Hayes, L.J. (1999). A computerised slot machine simulation to investigate the variables involved in gambling behavior. Behavior Research Methods, Instruments & Computers, 31, 731–734.
Schreiber, J., & Dixon, M.R. (2001). Temporal characteristics of slot machine play in recreational gamblers. Psychological Reports, 89, 67–72.
Sharpe, L. (2002). A reformulated cognitive-behavioral model of problem gambling: A biopsychosocial perspective. Clinical Psychology Review, 22, 1–25.
Sharpe, L., Walker, M., Coughlan, M., Enersen, K., & Blaszczynski, A. (2005). Structural changes to electronic gaming machines as effective harm minimization strategies for non-problem and problem gamblers. Journal of Gambling Studies, 21, 503–520.
Turner, N., & Horbay, R. (2004). How do slot machines and other electronic gambling machines actually work? Journal of Gambling Issues, 11. Retrieved September 23, 2006, from http://www.camh.net/egambling/issue11/jgi_11_turner_horbay.html
Vitaro, F., Arseneault, L., & Tremblay, R.E. (1999). Impulsivity predicts problem gambling in low SES adolescent males. Addiction, 94, 565–575.
Walker, M.B. (1992). Irrational thinking among slot machine players. Journal of Gambling Studies, 8, 245–288.
Weatherly, J.N., & Brandt, A.E. (2004). Participants' sensitivity to percentage payback and credit value when playing a slot machine simulation. Behavior and Social Issues, 13, 33–50.
Weatherly, J.N., Sauter, J.M., & King, B.M. (2004). The “big win” and resistance to extinction when gambling. The Journal of Psychology, 138, 495–504.
Weiten, W. (2007). Psychology: Themes and variations (7th ed.). Belmont, CA: Thomson Wadsworth.
Zlomke, K.R., & Dixon, M.R. (2006). Modification of slot-machine preferences through the use of a conditional discrimination paradigm. Journal of Applied Behavior Analysis, 39, 351–361.
Figure 1. Distribution of reinforcers under a VR schedule (variable ratio 2.5).
Figure 2. Distribution of reinforcers under an RR schedule (random ratio 2.56).
Figure 3. Distribution of reinforcers when playing 10 pay lines (random ratio 3.0).
Figure 4. Distribution of reinforcers when playing 20 pay lines (random ratio 2.2).
Keywords: schedules of reinforcement, random ratio, gambling behaviour.