The distribution of rewards in both variable-ratio and random-ratio schedules is examined with specific reference to gambling
behaviour. In particular, it is the number of early wins and unreinforced trials that is suggested to be of importance in
these schedules, rather than the often-reported average frequency of wins. Gaming machine data are provided to demonstrate
the importance of early wins and unreinforced trials. Additionally, the implication of these distributional properties for
betting strategies and the gambler's fallacy is discussed. Finally, the role of early wins and unreinforced trials is considered
for gambling research that utilises simulated gaming machines and research that compares concurrent schedules of reinforcement.
Turner and Horbay (2004) provided a comprehensive review of the underlying mechanisms governing electronic gaming machine (EGM) play. Their review
addressed many of the misconceptions about the design of gaming machines, and although it was intended for counsellors, prevention
workers in the field of problem gambling, and the general public, it is also of use to those studying gambling behaviours
in experimental settings with simulated slot machines (e.g., Dixon, MacLin, & Daugherty, 2006; Weatherly & Brandt, 2004; Weatherly, Sauter, & King, 2004; Zlomke & Dixon, 2006).
The current paper extends some of the issues raised by Turner and Horbay (2004) with specific reference to random-ratio (RR) schedules and the role of early wins and unreinforced trials. First, the difference
between variable-ratio (VR) and RR schedules of reinforcement is discussed in terms of the number of early wins and the number
of unreinforced trials that each schedule provides. It is argued that these properties of gaming machine reinforcement have
implications for the gambler's fallacy, schedule-induced behaviours, and research using simulated gaming machines.
Second, the notion of early wins and unreinforced trials in RR schedules receives greater scrutiny in this paper. Just as
Turner and Horbay (2004) examined the misconceptions among gamblers regarding the law of averages and win/loss expectations, this paper examines the
misconception among researchers regarding average reinforcement rate and experimental control. Gaming machine data are provided
to illustrate the importance of the distribution of early wins and unreinforced trials.
A number of early gambling researchers referred to gaming machines as operating under a variable ratio of reinforcement (Cornish, 1978), and, even today, the slot machine is typically provided as an example of a VR schedule to undergraduate psychology students
(e.g., Weiten, 2007). It has since been documented that gaming machines operate under a more complex RR schedule of reinforcement (Crossman, 1983; Hurlburt, Knapp, & Knowles, 1980; Turner & Horbay, 2004), utilising pseudo-random number generators; Turner and Horbay (2004) debunked many of the myths associated with randomness in slot machine play. However, the difference between a VR and an RR
schedule of reinforcement has not been illuminated previously with reference to gambling behaviour.
A variable ratio of 2.5 indicates that, on average, one response in every 2.5 will be rewarded. When this type of VR schedule is
designed, it is done with a determined number of reinforced responses, for example, 1, 2, 3, and 4, arranged in a variable
order to form the VR sequence. The VR schedule comprises a number of different sized fixed-ratio schedules (Crossman, 1983). Behaviourally, this means that the maximum number of responses before reinforcement will be four, the minimum one. If the
VR schedule is activated repeatedly and randomly, this will result in an indefinitely long sequence of digits (not digits
of an indefinite size), which may serve as the run lengths on a VR schedule with an average run length of approximately 2.5.
With enough trials, around one quarter of all runs should be of length 1, one quarter of length 2, one quarter of length 3,
and one quarter of length 4.
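The construction described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in any cited study: it assumes each run length is drawn independently and with equal probability from the four fixed ratios, and the function name `vr_run_lengths` and its parameters are hypothetical.

```python
import random

def vr_run_lengths(n_runs, ratios=(1, 2, 3, 4), seed=0):
    """Draw run lengths for a VR schedule built from a fixed set of ratios.

    Assumption: each run length is chosen independently and uniformly
    from `ratios`, so no run can exceed the largest ratio.
    """
    rng = random.Random(seed)
    return [rng.choice(ratios) for _ in range(n_runs)]

vr_runs = vr_run_lengths(10000)
vr_mean = sum(vr_runs) / len(vr_runs)            # close to 2.5
vr_shares = {k: vr_runs.count(k) / len(vr_runs)  # each close to 0.25
             for k in (1, 2, 3, 4)}
print(vr_mean, vr_shares, max(vr_runs))
```

With a large number of runs, the mean settles near 2.5 and each run length accounts for about one quarter of runs, while the longest run never exceeds four.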
With a random ratio of 2.5, the sequence will contain run lengths with a mean of 2.5, but the run lengths themselves can range
from 1 to an indefinitely large number. Thus, whilst both types can be described by an average sequence of run lengths, the
distribution of run lengths for these two will be greatly different. In gaming machine play, this difference has implications
for both cognitive (the gambler's fallacy) and learning (schedules of reinforcement) explanations of persistent gaming behaviour.
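For comparison, an RR schedule with the same mean can be simulated as a sequence of independent presses. The sketch below assumes a constant win probability of 1/2.5 = 0.4 on every press, which gives a mean run length of 2.5; the function name `rr_run_lengths` and the seed are illustrative choices, not part of any cited software.

```python
import random

def rr_run_lengths(n_runs, p_win=0.4, seed=0):
    """Simulate run lengths (presses up to and including a win) on an RR schedule.

    Assumption: every press wins with the same probability `p_win`,
    independently of all earlier presses.
    """
    rng = random.Random(seed)
    runs = []
    for _ in range(n_runs):
        presses = 1
        while rng.random() >= p_win:  # a loss, so the player presses again
            presses += 1
        runs.append(presses)
    return runs

rr_runs = rr_run_lengths(10000)
rr_mean = sum(rr_runs) / len(rr_runs)                            # close to 2.5
rr_first_press = rr_runs.count(1) / len(rr_runs)                 # close to 0.40
rr_beyond_four = sum(r > 4 for r in rr_runs) / len(rr_runs)      # close to 0.6 ** 4
print(rr_mean, rr_first_press, rr_beyond_four, max(rr_runs))
```

Under these assumptions, roughly 40% of runs end on the first press, and about 13% of runs are longer than four presses, a length that cannot occur under the variable ratio of 2.5 described earlier.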
Under the VR schedule outlined above, the probability of a reinforcer on the next response increases with every unrewarded
response (Crossman, 1983). That is, the first response has a 0.25 chance of being rewarded, and if no reward is provided, then the next response has
a 0.33 chance of being rewarded, the next has a 0.50 chance, and the last has a 1.00 chance. Thus, the maximum number of unreinforced
responses is three, and if this sequence occurs then there is a 100% probability that the next response will be rewarded.
This is because a VR schedule is designed with a predetermined number of reinforced response lengths: in this example, they
are 1, 2, 3, and 4. With adequate exposure to these conditions the gaming machine player could rationally expect a win after
a loss and develop a reasonable strategy of increasing the stake size to increase the impending reward. The development of
this type of strategy is considered the basis for the principle of the gambler's fallacy (Ayton & Fischer, 2004; Ladouceur, 2004) when applied to RR schedules; however, the probabilities indicate that it is not a fallacy under a VR schedule.
Furthermore, after a response has been rewarded, the probability of the next response (recommencement of play) being rewarded
is 0.25. Therefore, the probability of it not being rewarded is 0.75. Thus, if the experience of play has been that after a
win another win occurs only 25% of the time, or that no win occurs 75% of the time, the behaviour of the player is likely
to reflect this. The player may adjust the size of their bet based on the probability of a win or loss.
Under an RR schedule, each response-outcome is independent of the previous one because there is a constant probability of
payoff for each trial (Crossman, 1983; Hurlburt et al., 1980). All EGMs operate under an RR schedule, and the size of this probability is determined in a more complex manner by a random
number generator (see Turner & Horbay, 2004, for a more detailed explanation of the modern EGM configuration). EGMs are also very volatile, and the response-outcome
relationship is influenced by secondary machine characteristics such as the multiplier potential, the pay structure, “free”
games, near misses, and linked jackpots (Griffiths, 1993). These can all promote irrational beliefs about winning, and, under an RR schedule, the gambler's fallacy does exist, because
the distribution of wins for an RR schedule differs from that of a VR schedule.
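The conditional probabilities discussed in the preceding paragraphs can be checked with exact arithmetic. The sketch below computes, for each press, the probability of a win given that every earlier press in the run was a loss. It assumes the VR schedule gives run lengths 1 to 4 with equal probability and the RR schedule has a constant win probability of 2/5; the helper name `win_probability_by_press` is hypothetical, and the RR distribution is truncated at eight presses purely for display.

```python
from fractions import Fraction

def win_probability_by_press(pmf):
    """Return P(win on press k | presses 1..k-1 all lost) for each k.

    `pmf` maps consecutive run lengths 1, 2, 3, ... to their probabilities.
    """
    still_waiting = Fraction(1)  # probability that no win has occurred yet
    result = {}
    for k in sorted(pmf):
        result[k] = pmf[k] / still_waiting
        still_waiting -= pmf[k]
    return result

# VR 2.5: run lengths 1-4, each with probability 1/4 (assumed).
vr_pmf = {k: Fraction(1, 4) for k in (1, 2, 3, 4)}

# RR 2.5: constant win probability 2/5; first eight run lengths only.
p = Fraction(2, 5)
rr_pmf = {k: (1 - p) ** (k - 1) * p for k in range(1, 9)}

vr_by_press = win_probability_by_press(vr_pmf)  # 1/4, 1/3, 1/2, 1
rr_by_press = win_probability_by_press(rr_pmf)  # 2/5 at every press
print(vr_by_press)
print(rr_by_press)
```

Under the VR schedule the probability of a win climbs with each loss, so expecting a win after a losing streak is reasonable. Under the RR schedule it stays at 2/5 no matter how many losses have accumulated, which is why the same expectation becomes a fallacy.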
It is worth noting that some studies have assessed the rate of responding and postreinforcement pauses on EGMs in relation
to wins and losses (Delfabbro & Winefield, 1999; Dickerson, Hinchy, Legg England, Fabre, & Cunningham, 1992; Schreiber & Dixon, 2001) and have generally found a pattern of play on slot machines that is very similar to that found on VR schedules.
The only published study comparing human gaming behaviour under both a VR and an RR schedule is Hurlburt et al. (1980). Their study involved 20 undergraduate students playing a computer-simulated game in a laboratory setting and gambling bogus
money. Their dependent variables were schedule preference, measured by the number of bets made, and strategy employment, measured
by the amount staked per gamble (with increasing stake size indicating the player believed a win was imminent). The aim of
the Hurlburt et al. study was to determine if participants preferred a VR schedule to an RR schedule and whether participants
employed a betting strategy on a VR schedule but not an RR schedule. The results suggested no behavioural differences between
the schedules, although the support for the null hypothesis may be explained by poor ecological validity and statistical power
problems. The study utilised an unrealistic teletype simulation for the slot machine (Dixon et al., 2006) with a small number of trials, and the power of the statistical test chosen was adequate to detect very large effect sizes
only.
Hurlburt et al. (1980) noted other explanations for the support of the null hypothesis. They suggested that the manner in which the participants
were introduced to the schedules might have played a critical role, as “[s]haping is apparently more likely than verbal instructions
to lead to differential responding” (p. 638). Thus, the behavioural significance of the distributional difference between
the variable ratio and the random ratio may become more apparent over a greater number of trials, as learning of the distributional
properties of the VR schedule may take some time. Other work on schedules has also suggested that exposure levels may explain
sensitivity to schedules (Weatherly & Brandt, 2006).
There is still uncertainty regarding the behavioural differences between a VR and an RR schedule. Empirically, this could
have an impact on the use of computer-simulated gaming devices based on VR schedules or where the schedule is unknown. However,
a computerised slot machine has been devised by MacLin, Dixon, and Hayes (1999) which operates under an RR schedule (Zlomke & Dixon, 2006) and allows researchers to manipulate a number of key variables. Several published studies have since utilised this freely
available software (Dixon et al., 2006; Schreiber & Dixon, 2001; Weatherly et al., 2004; Weatherly & Brandt, 2004; Zlomke & Dixon, 2006) to test cognitive and learning explanations of gambling behaviour. However, researchers using actual slots or computer-simulated
versions need to be aware of the distributional properties of RR schedules in order to ensure control across participants
and machines. In particular, it is argued below that the important consideration is, again, the distribution of early wins
and unreinforced trials.
Another problem with the Hurlburt et al. (1980) study was that the difference in the distribution of reinforcement between the VR and RR schedules was not illustrated. Under
a VR schedule, with sufficient trials, the distribution of reinforcement should be graphically represented as a flat (horizontal) line.
This reflects the fact that the frequency of wins occurring after one response is the same as the frequency of wins occurring
after two, three, or four responses. However, under an RR schedule, the distribution of reinforcement is very different. With
a random ratio of 2.5, a win may occur only after 100 responses (which is impossible under a variable ratio of 2.5), and such a long run
pulls the mean run length upwards (the effect an outlier has on the mean). Therefore, under an RR schedule, most reinforcers must
follow short runs, which offsets the effect of any long run and keeps the mean at 2.5.
This is shown in the figures below. Figure 1 shows the distribution of wins under a VR schedule (variable ratio 2.5) and Figure 2 displays the results of 856 bets placed by the author on a real slot machine in a gaming venue, providing an RR schedule
(random ratio 2.56).
Figures 1 and 2 illustrate the difference in reinforcer rates between a VR schedule and an RR schedule. Both have a similar mean reinforcement
ratio, but the distribution of reinforcers is considerably different. It is also evident that under the RR schedule the
modal run length (a win on the first press) is shorter than the mean run length. Figure 2 shows that over 35% of first button presses are reinforced, compared to only 25% for the VR schedule. Also, the number of
unreinforced trials is vastly different between the two distributions. Just how this difference is reflected in gambling behaviour
is unclear, but it is possible that regular players become sensitive to the number of early wins and/or the number of unreinforced
trials and operate according to these values. Certainly, both of these would appear easier to detect in gaming machine play
than the average reinforcement rate.
If players are aware of these characteristics, then it is these characteristics that must be reported when testing the effect
of schedules on playing behaviour. The study by Hurlburt et al. (1980) controlled for the average rate of reinforcement and reported that participants failed to discriminate. This may be because
the numbers of early wins and unreinforced trials were similar in their examples. Similarly, other studies comparing schedules
also only report the average (Dixon et al., 2006; Schreiber & Dixon, 2001) and assume control has been achieved if the two distributions have the same average. Just as Turner and Horbay (2004) illustrated how the average reinforcement figure can mislead gamblers about the nature of the reinforcement distribution,
the average reinforcement figure can also mislead the researcher into believing that control has been achieved.
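One way to make this point concrete is to summarise a recorded sequence of wins and losses by more than its average. The sketch below is a hypothetical helper (the name `summarise_outcomes` and the two short example sessions are invented for illustration, not data from any study). It reports the presses per win alongside the share of wins that arrive on the first press after the previous win and the longest run of unreinforced trials.

```python
def summarise_outcomes(outcomes):
    """Summarise a session given as a sequence of 1 (win) / 0 (loss) per press."""
    runs = []              # presses up to and including each win
    losses = 0             # losses since the last win
    longest_loss_run = 0
    for won in outcomes:
        if won:
            runs.append(losses + 1)
            losses = 0
        else:
            losses += 1
            longest_loss_run = max(longest_loss_run, losses)
    n_wins = len(runs)
    return {
        # total presses divided by total wins
        "mean_ratio": len(outcomes) / n_wins if n_wins else None,
        # share of wins that needed only one press
        "first_press_win_share": runs.count(1) / n_wins if n_wins else None,
        "longest_unreinforced_run": longest_loss_run,
    }

# Two invented sessions with the same average (4 wins in 10 presses).
session_a = [1, 0, 0, 1, 0, 1, 0, 0, 0, 1]   # run lengths 1, 3, 2, 4
session_b = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]   # run lengths 1, 1, 1, 7

summary_a = summarise_outcomes(session_a)
summary_b = summarise_outcomes(session_b)
print(summary_a)  # mean_ratio 2.5, first_press_win_share 0.25, longest run 3
print(summary_b)  # mean_ratio 2.5, first_press_win_share 0.75, longest run 6
```

Both sessions have a mean ratio of 2.5, yet they differ markedly in early wins and in the longest losing run. Matching schedules on the average alone would treat them as equivalent.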
When testing concurrent schedules, it may be that the number of early wins and unreinforced trials needs to be similar to
properly ensure control across the schedules. Zlomke and Dixon (2006) provided an excellent example of the experimental rigour needed when testing concurrent schedules. Using the simulated game
from MacLin et al. (1999), they compared machines that differed only in colour by controlling for possible variations in reinforcement density. This
resulted in an identical sequence of trial outcomes, thereby ensuring control.
With different machines having different RR distributions in gaming venues, these characteristics may affect machine selection
and machine persistence. It is possible that players have a preference for schedules based on the number of early wins and
the number of unreinforced trials. There is some evidence that players prefer small frequent wins (Dixon et al., 2006; Griffiths, 1999) and also support for the general concept that the placement of wins in a gambling cycle can influence gambling behaviour
such as persistence (Weatherly et al., 2004).
Delfabbro and Winefield (2000) and Walker (1992) linked persistent gaming machine play to irrational thoughts generated by beliefs about gaming machine reinforcement schedules.
Sharpe (2002) extended this point and cited Vitaro, Arseneault, and Tremblay's (1999) finding that impulsive individuals tend to prefer immediate reinforcement. She concluded that the placement of wins early
in the gaming experience (i.e., a big win when first gambling) and the patterns of wins and losses within gaming sessions
“may have etiological significance in the development of problematic levels of gambling in vulnerable individuals” (p. 8).
She developed a comprehensive model of problem gambling that included win/loss patterns and cognitive biases.
Another effect of RR schedules on the way gaming machines are played is bet size. On North American and Australian slot machines,
the number of lines played is determined by the player with each line being purchased, and this has the tradeoff of increasing
the frequency of reinforcement. Figures 3 and 4 show the distribution of reinforcers when playing 10 lines and when playing 20 lines on the same slot machine. The same number
of bets was placed on each (n = 428).
Figures 3 and 4 clearly show that increasing the number of pay lines from 10 to 20 increases the mean reinforcer rate from 1 in 3.00 to 1
in 2.20. However, of greater interest is the fact that the run of unreinforced trials was longer when playing 10 lines (maximum
= 12) compared to 20 lines (maximum = 7). The most frequently occurring run length before reinforcement was one response under both conditions,
but the percentage of trials rewarded after one response was higher when playing 20 lines (45%) compared to 10 lines (28%).
Perhaps it is this increase in the number of early wins, and the decrease in the length of unreinforced trials, that influences
player betting strategies and the decision to continue gambling. By purchasing more lines to play on a slot machine, a player
can increase the frequency of reinforcement and reduce the number of unreinforced trials. This could promote the player's
belief that they can control the betting outcomes (e.g., “If I buy more lines I get more wins and fewer losses”), which is
true regarding the frequency of (small) wins, but actually leads to an increase in the rate of net loss. Empirical investigation
of this is needed with regard to the illusion of control and possible chasing behaviour due to increased rates of losses.
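As a rough benchmark for the 10-line and 20-line figures reported above, one can ask what a schedule with a constant win probability per press would predict from the mean ratio alone. The sketch below assumes such a constant probability (1 divided by the mean ratio), which is a simplification of how a real machine behaves, and the function name `constant_probability_benchmark` is hypothetical.

```python
def constant_probability_benchmark(mean_ratio, loss_run):
    """Predictions for an RR schedule with win probability 1 / mean_ratio.

    Assumption: every press wins with the same probability, independently.
    """
    p_win = 1 / mean_ratio
    return {
        # expected share of wins arriving on the first press
        "first_press_win_share": p_win,
        # chance that a given run starts with at least `loss_run` losses
        "p_loss_run_at_least": (1 - p_win) ** loss_run,
    }

ten_lines = constant_probability_benchmark(3.00, loss_run=7)
twenty_lines = constant_probability_benchmark(2.20, loss_run=7)
print(ten_lines)
print(twenty_lines)
```

On these assumptions, the expected share of first-press wins is about 33% at a mean ratio of 3.00 and about 45% at 2.20, against the observed 28% and 45%. Long losing runs are also predicted to be several times more likely on 10 lines than on 20, which matches the direction of the observed difference in maximum run length.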
It is worth noting that increasing the number of lines played increases the amount staked and that a machine's maximum stake
limit has been shown as a characteristic that influences time and money spent gambling, along with other behaviours such as
cigarette and alcohol consumption (Sharpe, Walker, Coughlan, Enersen, & Blaszczynski, 2005). Hence, early wins and unreinforced trials are perhaps the components of the RR schedule that need to be manipulated and
reported in studies of the effect of schedules of reinforcement on gaming machine behaviour.
The current paper provides an important extension to Turner and Horbay's (2004) review of EGM design. This extension is of most benefit to gaming machine researchers because there is a need for awareness
of the differences between RR and VR schedules. This has methodological implications for research and is important for the
appropriate evaluation of research in this field. In particular, gaming machine researchers should be aware of the difference
in the distribution of reinforcement between the two types of schedules. This will have an impact on the use of simulated
gaming devices in research and the generalisation of behaviour under a VR schedule to RR schedules. Moreover, there is a need
for research to report the frequency of early wins and the length of unreinforced trials in the RR distribution, rather than
assume that two distributions are identical based on the average reinforcement rate. To date, the theoretical and behavioural
significance of early wins and unreinforced trials has not been examined within the gaming machine context; however, there
does appear to be some relationship with the gambler's fallacy, the illusion of control, and the role that reinforcement has
on persistent gaming behaviour.
Copyright © 2020 | Centre for Addiction and Mental Health
Journal Information
Journal ID (publisher-id): jgi
ISSN: 1910-7595
Publisher: Centre for Addiction and Mental Health
Article Information
© 1999-2008 The Centre for Addiction and Mental Health
Received Day: 9 Month: April Year: 2007
Accepted Day: 24 Month: August Year: 2007
Publication date: June 2008
First Page: 56 Last Page: 67
Publisher Id: jgi.2008.21.6
DOI: 10.4309/jgi.2008.21.6
Random-ratio schedules of reinforcement: The role of early wins and unreinforced trials
School of Psychology, University of Western Sydney, New South Wales, Australia. E-mail: john.haw@uws.edu.au
For correspondence: John Haw, School of Psychology, University of Western Sydney, Locked Bag 1797, South Penrith Distribution
Centre, NSW 1797, Australia. E-mail: john.haw@uws.edu.au
All URLs were active at the time of submission. This article was peer-reviewed.
Competing interests: None declared.
Ethics approval: Not required.
Funding: None.
John Haw (PhD) is a lecturer of psychological research methods in the School of Psychology, University of Western Sydney,
Australia. He completed his PhD thesis, “An operant analysis of gaming machine play”, in 2000 and has conducted further research
on the psychological predictors of problem gambling. He has also consulted with industry and government on various gambling
issues and has supervised PhD students in the area of cognitive/behavioural explanations of gambling behaviours.
References
Ayton, P., & Fischer, I. (2004). The hot hand fallacy and the gambler's fallacy. Two faces of subjective randomness. Memory and Cognition, 32, 1369–1378.
Cornish, D.B. (1978). Gambling: A review of the literature and its implications for policy and research. London: Her Majesty's Stationery Office.
Crossman, E. (1983). Las Vegas knows better. The Behavior Analyst, 6, 109–110.
Delfabbro, P.H., & Winefield, A.H. (1999). Poker machine gambling: An analysis of within session characteristics. British Journal of Psychology, 90, 425–439.
Delfabbro, P.H., & Winefield, A.H. (2000). Predictors of irrational thinking in regular slot machine gamblers. The Journal of Psychology, 134, 117–128.
Dickerson, M.G., Hinchy, J., Legg England, S., Fabre, J., & Cunningham, R. (1992). On the determinants of persistent gambling: 1. High frequency poker machine players. British Journal of Psychology, 83, 237–248.
Dixon, M.R., MacLin, O.H., & Daugherty, D. (2006). An evaluation of response allocations to concurrently available slot machine simulations. Behavior Research Methods, 38, 232–236.
Griffiths, M.D. (1993). Fruit machine gambling: The importance of structural characteristics. Journal of Gambling Studies, 9, 101–120.
Griffiths, M.D. (1999). Gambling technologies: Prospects for problem gambling. Journal of Gambling Studies, 15, 265–283.
Hurlburt, R.T., Knapp, T.J., & Knowles, S.H. (1980). Simulated slot-machine play with concurrent variable and random-ratio of schedules of reinforcement. Psychological Reports, 47, 635–639.
Ladouceur, R. (2004). Perceptions among pathological and nonpathological gamblers. Addictive Behaviors, 29, 555–565.
MacLin, O.H., Dixon, M.R., & Hayes, L.J. (1999). A computerised slot machine simulation to investigate the variables involved in gambling behavior. Behavior Research Methods, Instruments & Computers, 31, 731–734.
Schreiber, J., & Dixon, M.R. (2001). Temporal characteristics of slot machine play in recreational gamblers. Psychological Reports, 89, 67–72.
Sharpe, L. (2002). A reformulated cognitive-behavioral model of problem gambling: A biopsychosocial perspective. Clinical Psychology Review, 22, 1–25.
Sharpe, L., Walker, M., Coughlan, M., Enersen, K., & Blaszczynski, A. (2005). Structural changes to electronic gaming machines as effective harm minimization strategies for non-problem and problem gamblers. Journal of Gambling Studies, 21, 503–520.
Turner, N., & Horbay, R. (2004). How do slot machines and other electronic gambling machines actually work? Journal of Gambling Issues, 11. Retrieved September 23, 2006, from http://www.camh.net/egambling/issue11/jgi_11_turner_horbay.html
Vitaro, F., Arseneault, L., & Tremblay, R.E. (1999). Impulsivity predicts problem gambling in low SES adolescent males. Addiction, 94, 565–575.
Walker, M.B. (1992). Irrational thinking among slot machine players. Journal of Gambling Studies, 8, 245–288.
Weatherly, J.N., & Brandt, A.E. (2004). Participants' sensitivity to percentage payback and credit value when playing a slot machine simulation. Behavior and Social Issues, 13, 33–50.
Weatherly, J.N., Sauter, J.M., & King, B.M. (2004). The “big win” and resistance to extinction when gambling. The Journal of Psychology, 138, 495–504.
Weiten, W. (2007). Psychology: Themes and variations (7th ed.). Belmont, CA: Thomson Wadsworth.
Zlomke, K.R., & Dixon, M.R. (2006). Modification of slot-machine preferences through the use of a conditional discrimination paradigm. Journal of Applied Behavior Analysis, 39, 351–361.
Keywords
schedules of reinforcement, random ratio, gambling behaviour.
Editor-in-chief: Nigel E. Turner, Ph.D.
Managing Editor: Vivien Rekkas, Ph.D. (contact)
Introduction
Variable ratios, random ratios, and the gambler's fallacy
Early wins and unreinforced trials
Conclusion