Evolutionary Game Design-rxs.pdf

(1110 KB) Pobierz
639975166 UNPDF
IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES, VOL. 2, NO. 1, MARCH 2010
1
Evolutionary Game Design
Cameron Browne and Frederic Maire , Member, IEEE
ABSTRACT— It is easy to create new combinatorial games but more
difficult to predict those that will interest human players. We ex-
amine the concept of game quality, its automated measurement
through self-play simulations, and its use in the evolutionary search
for new high-quality games. A general game system called Ludi is
described and experiments conducted to test its ability to synthe-
size and evaluate new games. Results demonstrate the validity of
the approach through the automated creation of novel, interesting,
and publishable games.
Pell [1, p. 10] states:
If we could develop a program which, upon considera-
tion of a particular game, declared the game to be uninter-
esting, this would seem to be a true sign of intelligence!
So when this becomes an issue, we will know that the field
has certainly matured.
With this in mind, we describe a framework called Ludi for
the measurement and synthesis of combinatorial games, and
three experiments designed to test the system and the validity
of the following hypotheses.
I. There exist fundamental (and measurable) indicators of
game quality.
II. These fundamental indicators may be harnessed for the
directed search for new high-quality games.
INDEX TERMS— Aesthetics, artificial intelligence (AI), combinato-
rial game, evolutionary search, game design.
I. I NTRODUCTION
W HILE games research has led to many important arti-
ficial intelligence (AI) breakthroughs over the last few
decades, these have generally come through the study of classics
such as Chess , Go , and Checkers , and used as their yardstick for
success the strength of the artificial player. Little attention has
been paid to measuring the quality of the games themselves or
asking such questions as follows.
• What makes a game interesting to play?
• Can we tell if a game is likely to become a classic?
The emergence of the games industry as a commercial
phenomenon makes these questions increasingly important, as
record numbers of designers produce record numbers of games
each year, but it can take years of play testing and commercial
enquiry to determine whether a game is likely to succeed.
A tool for automatically measuring the quality of a given
game could be of significant benefit to both game designers
and the industry. It could reduce development time by quickly
detecting flaws, and reduce the need for extensive play testing
which can require designers to prematurely reveal their proto-
types to external sources. Further, such a tool could direct the
automated search for new rule combinations, with a view to sug-
gesting interesting avenues for designers to pursue or even to
producing complete new games of meaningful quality.
II. D EFINING G AMES
Of the many ways to define a game, Salen and Zimmerman
make the following useful observation: A game is a system in
which players engage in an artificial conflict, defined by rules,
that results in a quantifiable outcome [2]. This definition was
condensed from the findings of many prior studies, most of
which identified the following key elements:
• rules;
• play;
• outcome.
A game may therefore be expressed in terms of its means ,
play , and ends . These three aspects are central to the design
of Ludi and its underlying model of game analysis, and will
provide a recurring theme throughout this paper.
A. Combinatorial Games
We focus on combinatorial games , which are:
finite : produce a well-defined outcome;
discrete : turn-based;
deterministic : chance plays no part;
perfect information : no hidden information;
two-player .
The two-player requirement is debatable as solitaire puzzles
may constitute combinatorial games, in the sense that the puzzle
solver competes against the null player and indirectly the de-
signer who set the challenge. Multiplayer games with three or
more players fall outside the scope of combinatorial play due to
the social aspect of coalitions that may arise.
The term game will henceforth refer to a two-player combi-
natorial game throughout this paper. Such games are an ideal
test bed for the experiments as they are typically deep but de-
scribed by simple, well-defined rule sets.
Note that this is not a work in combinatorial game theory
(CGT), which is concerned with the analysis of games with a
Manuscript received September 04, 2009; manuscript revised November 27,
2009. Date of manuscript acceptance January 25, 2010; date of publication Feb-
ruary 02, 2010; date of current version March 17, 2010. This work was supported
in part by an Australian Postgraduate Award and a small grant from a Games
Research scheme conducted by Microsoft Research Asia.
C. Browne is with the Computational Creativity Group, Department of Com-
puting, Imperial College London, London SW7 2AZ, U.K. (e-mail: cameron.
browne@imperial.ac.uk).
F. Maire is with the Faculty of Science and Technology, Queensland Univer-
sity of Technology, Brisbane, Qld. 4000, Australia (e-mail: f.maire@qut.edu.
au).
Color versions of one or more of the figures in this paper are available online
at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TCIAIG.2010.2041928
1943-068X/$26.00 © 2010 IEEE
AUTHORIZED LICENSED USE LIMITED TO: IEEE XPLORE. DOWNLOADED ON MAY 13,2010 AT 11:51:36 UTC FROM IEEE XPLORE. RESTRICTIONS APPLY.
2
IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES, VOL. 2, NO. 1, MARCH 2010
C. Recombination Games
Fig. 1. Games of: (a) Tic-Tac-Toe and (b) Tic-Tac-Toe (3-D) won by White.
Given a game in its ludemic form, it is a simple matter to
manipulate its rules to create variants and new games. For Tic-
Tac-Toe , such modifications might include the board size
(size 2 2)
or the target line length
(win (in-a-row 2))
However, a moment’s reflection will reveal that each of these
changes breaks the game, by making it unwinnable in the first
case and trivially winnable in the second.
Other manipulations might involve extending the board to
three dimensions, as shown in Fig. 1(b)
(size 3 3 3)
or inverting the end condition to give a misere version
(lose (in-a-row 3))
These variants are both more interesting but still trivially
solvable, and are more notable for their novelty value than any
inherent value as games. There is much room for improvement
in this branch of the -in-a-row family.
The difficulty of deriving an interesting game from
Tic-Tac-Toe does not just stem from the fact that it is it-
self flawed (it is drawish if played correctly). There is the
serious problem that rule sets for combinatorial games tend
to be highly optimized and fragile; authors strive for the sim-
plest rule sets that give the deepest playing experience, and
the slightest change will generally break a game. As in most
creative fields, it is easy to generate artificial content but much
more difficult to generate artificial content of human expert
quality.
Given that the rule sets of most existing games are highly opti-
mized—certainly the well-known ones—it is unlikely that such
simple manipulations of a game’s degrees of freedom will pro-
duce a better game in isolation. The designer would usually have
tested such obvious variants and discarded them as inferior. In-
stead, a more promising approach is to recombine the game’s
rules with rules from other games and look for the emergence
[8] of interesting, new rule combinations not previously consid-
ered. The idea that there preexist a multitude of games in the
form of optimal rule combinations waiting to be discovered res-
onates strongly with the Platonist view of mathematics [9]. The
question then becomes how to search this potentially huge de-
sign space effectively.
view to solving them or at least finding optimal strategies [3] and
developing artificial players able to challenge human experts.
Within the context of this study, the artificial player is of little
interest except as a means for providing self-play simulations.
While it must be of sufficient strength to provide meaningful
playouts, we are concerned primarily with the quality of the
game itself rather than the quality of the player.
B. Ludemes
Just as a meme is a unit of information that replicates from
one person to another [4], a ludeme is a game meme or unit of
game information. First coined by Borvo [5], this term describes
a fundamental unit of play often equivalent to a rule; ludemes
are the conceptual equivalent of a game’s components—both
material and nonmaterial—and are notable for their ability to
pass from one game or game class to another [6].
Ludemes may be single units of information, such as the fol-
lowing items that describe aspects of the game board shown in
Fig. 1(a):
(tiling square)
(size 3 3)
Conceptually related items may be encapsulated to form
higher level compound ludemes as follows:
(board
(tiling square)
(size 3 3)
)
Collecting rules into such compound ludemes is a convenient
way to describe games. For example, the essence of Tic-Tac-Toe
may be succinctly described as follows (assuming a two-player
combinatorial model):
(game Tic-Tac-Toe
(board
(tiling square)
(size 3 3)
D. Game Distance
)
(win (in-a-row 3))
It can be useful to measure the distance between existing
games and a newly derived rule set, in order to determine
whether it constitutes a duplicate, variant, or completely new
game.
The distinction between a variant and a new game is subtle,
but may be achieved by representing both games as rule trees
(based on their ludemic descriptions introduced above) and
accumulating the total weighted difference between these two
trees. Differences between rule parameters are weighted lightly
whereas structural differences between the rules themselves are
weighted more heavily, in inverse proportion to their depth of
nesting; higher level rules generally have wider applicability
)
The concept of an entire game as an item of information may
seem odd but it is valid; there exist many examples of iden-
tical games being discovered, fully formed, at similar times. The
most famous case is the independent discovery of Hex by math-
ematicians Piet Hein and John Nash in the 1940s [7]. A more
recent example is Chameleon, discovered by New Zealand and
USA designers within a week of each other in 2003. Such cases
may be examples of “memetic convergence” in action towards
optimal designs.
AUTHORIZED LICENSED USE LIMITED TO: IEEE XPLORE. DOWNLOADED ON MAY 13,2010 AT 11:51:36 UTC FROM IEEE XPLORE. RESTRICTIONS APPLY.
639975166.049.png 639975166.050.png
BROWNE AND MAIRE: EVOLUTIONARY GAME DESIGN
3
Fig. 2. Framework of the Ludi system.
Fig. 3. Basic game model.
and are therefore generally more important. If the total dif-
ference between the two rule sets exceeds a certain threshold
value then the two games are considered to be distinct.
The most widely used GDL is probably the commercially
available Zillions of Games ZRF rule language [12]. ZRF au-
thors define games in a Lisp-like syntax using predefined key-
words, and may programmatically create complex rule struc-
tures through macros. More recently, the Stanford GDL used
for the AAAI GGP competitions [13] is a lower level language
that defines games using first-order logic.
E. General Game Players
Given the possibility of creating a large number of rule sets, it
would be desirable to test them automatically through self-play.
General game players (GGPs)—software systems for playing
a range of games well rather than any one particular game ex-
pertly—are ideal for this purpose.
GGPs were first proposed several decades ago [10] but have
recently enjoyed a resurgence of interest as researchers come
to realize their potential value to the gaming and broader AI
communities. This includes GGP competitions run over recent
years in conjunction with international AI conferences [11].
III. T HE L UDI S YSTEM
Ludi is a system for playing, measuring, and synthesizing
games within the scope of its GDL (Fig. 2). The main compo-
nents of the system are:
GDL : defines the scope of games;
GGP : interprets games and coordinates play;
strategy module : informs move planning;
criticism module : measures game quality;
synthesis module : generates new games.
F. Game Description Languages
Central to any GGP is the game description language (GDL)
that defines the scope of games understood by the system. There
is a delicate balance between defining a GDL that is powerful
and extensible enough to encompass a wide range of known and
not-yet-known games, yet also efficient, elegant, and compre-
hensible to human authors.
A. Ludi GDL
The Ludi GDL is a high-level game description language
based on the ludemic understanding of games outlined in
Section II. It is structured to follow the basic means-play-ends
model of games, extended to include the relationship between
the game and its players (Fig. 3).
AUTHORIZED LICENSED USE LIMITED TO: IEEE XPLORE. DOWNLOADED ON MAY 13,2010 AT 11:51:36 UTC FROM IEEE XPLORE. RESTRICTIONS APPLY.
639975166.051.png
4
IEEE TRANSACTIONS ON COMPUTATIONAL INTELLIGENCE AND AI IN GAMES, VOL. 2, NO. 1, MARCH 2010
The Ludi GDL was devised with Kernighan and Pike’s prin-
ciples of good software design [14] in mind:
• simplicity;
• clarity;
• generality;
• automation.
It is a higher level language than the Stanford GDL and Zil-
lions ZRF, and although concise and conducive to human au-
thoring and machine manipulation, it lacks the universal gen-
erality of the Stanford GDL in particular. However, its hierar-
chical and well-defined nature makes it ideal for the intended
experiments, as it is much more likely that a structured tree-
based language will evolve sensible rule sets than an unstruc-
tured logic-based one. The Ludi GDL proved sufficiently rich
for this intended purpose that its somewhat limited scope was
not an issue.
The following example conveys the essence of the language:
(game Tic-Tac-Toe
(players White Black)
(board
(tiling square i-nbors)
(size 3 3)
legal destination cells for movable pieces are similarly marked
“ ” when those pieces are clicked on.
The play manager coordinates play for one to eight human
and artificial players, although only two are used for this study.
This includes move scheduling, input handling, ko repetition
testing to avoid infinite loops, all players passing, and so on.
Moves for artificial players are planned using standard alpha
beta adversarial search with move ordering, beam width reduc-
tion, and iterative deepening [16]. Estimated values for nonter-
minal board positions are provided by the Strategy module, as
follows.
C. Strategy Module
The strategy module provides value estimates for given board
positions relative to each player. This is achieved through a set
of advisors working within certain policies.
1) Advisors: Advisors are evaluation functions that represent
some narrow but rational view of the board position [17], and
express whether a player’s position is favorable or unfavorable
within this perspective [18].
Ludi defines 20 such advisors for aspects such as mobility,
proximity to goal, attacking potential, connective potential, and
so on. Each advisor takes as input a board state, end condition,
and player color (Fig. 5) and has the functionality to return both
an estimate of that player’s positional value and whether the end
condition has been achieved, as required.
2) Policies: The contributions from each advisor are com-
bined using a weighted linear function [15] to provide an overall
value
)
(end (All win (in-a-row 3)))
)
This game ( Tic-Tac-Toe ) is played between White and Black
on a 3 3 square grid with orthogonal and diagonal adjacency,
and is won by the player to make a line of three pieces of their
color (if any). Unless otherwise stated, it is assumed that players
take turns placing a piece of their color on an empty board cell
each move.
Ludi GDL definitions closely correspond to a game’s ludemic
description, which is how a human designer would typically
conceptualize the game. A more detailed description of the lan-
guage is given in Appendix I and further examples of games
defined in the GDL can be found in Appendix II.
for board state
(1)
where is a vector of weights and the set of
advisor functions. The weight vector constitutes a policy that
describes the relative importance of each advisor for that game.
3) Policy Optimization: For efficiency, it is important that
only those advisors relevant to the game have nonzero weight.
The system is able to derive a default policy for each game based
upon its rules and to optimize that policy through self-play using
two-membered evolutionary search -ES [19].
Policy optimization threw up some surprising emergent
strategies, such as a small negative weighting on stack height
to disincline repetitive cycles in stacking games, which was
later incorporated into default policies where appropriate. A
null “fight or flight” policy is used for any player with no
specified end conditions, in which pieces are moved towards
enemy pieces to encourage engagement while maximizing their
movement potential to allow escape if necessary.
Although a high level of play cannot be claimed for all
games, this advisor/policy approach proved sufficient for exer-
cising rule sets defined in the GDL and providing meaningful
self-play simulations. All players, human and artificial, are
beginners at any newly created game.
B. Ludi GGP
The core of the Ludi system is its general game player, as
shown in Fig. 2. The Ludi GGP is implemented in C++ and
provides the following functionality:
• rules parser;
• game object;
• user interface;
• play manager.
The rules parser loads and parses games defined in the Ludi
GDL. If a definition is valid according to the grammar, then the
corresponding ludeme tree is constructed and the single game
object initialized. The game object maintains a record of the
current board state and handles tasks such as the generation of
legal moves and testing for terminal conditions.
The user interface (Fig. 4) presents games uniformly and
anonymously so that quality judgments are made on the merits
of the games themselves rather than their visual attractiveness.
The interface provides a plain English translation of the current
rule set and a tutorial mode to help players understand new
games. In tutorial mode, legal placements are marked “ ” and
IV. G AME E VALUATION
The criticism module measures games for quality through
self-play simulations, based on certain aesthetic criteria.
AUTHORIZED LICENSED USE LIMITED TO: IEEE XPLORE. DOWNLOADED ON MAY 13,2010 AT 11:51:36 UTC FROM IEEE XPLORE. RESTRICTIONS APPLY.
639975166.052.png 639975166.001.png 639975166.002.png 639975166.003.png 639975166.004.png 639975166.005.png 639975166.006.png 639975166.007.png 639975166.008.png 639975166.009.png 639975166.010.png 639975166.011.png 639975166.012.png 639975166.013.png 639975166.014.png 639975166.015.png 639975166.016.png 639975166.017.png 639975166.018.png 639975166.019.png 639975166.020.png 639975166.021.png 639975166.022.png 639975166.023.png 639975166.024.png 639975166.025.png 639975166.026.png 639975166.027.png 639975166.028.png 639975166.029.png 639975166.030.png 639975166.031.png 639975166.032.png 639975166.033.png 639975166.034.png 639975166.035.png 639975166.036.png 639975166.037.png 639975166.038.png 639975166.039.png 639975166.040.png 639975166.041.png 639975166.042.png 639975166.043.png 639975166.044.png 639975166.045.png 639975166.046.png 639975166.047.png
BROWNE AND MAIRE: EVOLUTIONARY GAME DESIGN
5
Fig. 4. Ludi user interface.
Just as mathematicians strive for beautiful, aesthetically
meaningful abstractions [25], we are concerned not with the
sensual beauty of games but their intellectual appeal; the ele-
gance of the rules, how well they complement each other, and
the quality of the competition they produce. Contrary to Ellis
[26], the absence of flaws is a precondition for beauty.
Fig. 5. Advisor model.
A. Game Quality
The term quality in this context refers to the likelihood that a
given game will be of interest to human players. While players
generally know whether they enjoy a game, few can articulate
the reasons in concrete terms. Thompson [20] took a significant
step towards formalizing such concepts by defining four key
attributes that a game should possess:
depth : games should hold lasting interest;
clarity : their mechanics should not be confusing;
drama : hope of recovery from bad positions;
decisiveness : end quickly once a winner is certain.
Thompson also points out that games may be viewed as se-
quences of logic puzzles that players pose to each other, and
hence a good game should readily yield such puzzle positions.
Key attributes defined by other authors include interestingness
[21], uncertainty [22], interaction [23], and tension [24]. Origi-
nality in design is also of paramount importance.
B. Aesthetic Model
Birkhoff [27] describes the evaluation of visual art using func-
tions of certain aesthetic criteria , an approach later extended to a
complete algorithmic aesthetics system for the design and crit-
icism of aesthetically meaningful visual objects by Stiny and
Gips [28]. Similar principles can be applied to the aesthetic
measurement of games; in this case, the interpretation of an
object (game) will be a set of aesthetic measurements derived
through self-play, and the output of the algorithm will be a single
predicted aesthetic value. Importantly, Stiny and Gips distin-
guish between constructive and evocative modes for the under-
standing of objects within this system. In constructive mode ob-
jects are understood by the rules of their construction, and in
evocative mode they are understood by the associations, ideas,
or emotions they evoke.
Fig. 6 shows the player-centric aesthetic model of games de-
vised for this study. It is based on the basic game model of
Fig. 3 with the “play” component expanded to distinguish the
players’ strategic plans from their eventual moves. It is these
AUTHORIZED LICENSED USE LIMITED TO: IEEE XPLORE. DOWNLOADED ON MAY 13,2010 AT 11:51:36 UTC FROM IEEE XPLORE. RESTRICTIONS APPLY.
639975166.048.png
Zgłoś jeśli naruszono regulamin