Abstract
Collectible card games are taking up more space in popular culture, with traditional paper card games even embracing e-sports. However, longevity in such games is uncommon, and some suspect power creep as a culprit behind why some of these games fail. Yet Magic: the Gathering has not just survived but thrived for over 25 years, with the game’s designers publicly stating their aim to keep curbing power creep. Therefore, it is of interest to determine the rate of power creep in the game. Herein, we formally define a conservative metric of power creep and calculate its occurrence in the game of Magic: the Gathering. Although its rate is increasing, power creep appears low, with an average of 1.56 strictly better card faces released per year.
Introduction
Magic: the Gathering (MtG) is a successful collectible trading card game (TCG), started in 1993 and owned by Wizards of the Coast (WotC), that has continued expanding for over 25 years. With over 20 billion cards printed between 2008 and 2016 and well over an estimated 35 million players, MtG, as a game, continues to grow (Duffy, 2015; Webb, 2018; W. of the Coast, (n.d.)). This is evident in the increasing number of newly designed cards released each year (Figure 1). Yet, with an unprecedented number of card bannings in 2020, the question of how this game has remained successful over its long lifetime is worth consideration. Mark Rosewater, head designer at WotC since 2003, may have hinted that MtG’s longevity stems from avoiding power creep (Rosewater, 2016). Power creep, as its name implies, is the strengthening of the game and its pieces over time, possibly to the point where new pieces invalidate older ones.
Figure 1. The number of new card faces released each year. This is a subset of the number of faces released each year, as most MtG expansions include reprints of previously released cards.
While power creep is generally seen through the lens of how it affects game play (which may in turn affect who plays the game), Perreault, Daniel, and Tham (Perreault et al., 2021) look at how power creep may first alter those who play the game. In other words, with constant change of the game, those who play it must play more frequently to “keep up” irrespective of the game play quality (Falcão & Marques, 2019; Perreault et al., 2021; Zuin & Veloso, 2019). Given MtG’s longevity, Perreault, Daniel, and Tham’s perspective is of particular interest, as MtG players acknowledge the phenomenon that one never quits Magic; rather, one takes a break (Cordell, 2020; Woods, 2014). A low rate of power creep could result in the game remaining similar enough that returning players recognize it. Conversely, a high rate of power creep, while suggested to increase how frequently players must play, may leave a returning player unable to recognize the game and therefore unlikely to return (Ashton & Verbrugge, 2011).
An alternative perspective is that of the new player. Ben Brode, game director of Hearthstone until 2018, acknowledges that producing new content at the very least increases the game complexity which may make the game more daunting to new players who might wish to become enfranchised (Brode, 2015). Further, while it is often the enfranchised player complaining about power creep, game designers wish to keep their games exciting by producing new and powerful cards and mechanics (Ashton & Verbrugge, 2011; Brode, 2015; Stoddard, 2019). It seems, regardless of one’s perspective on power creep, it should be avoided. To that end, MtG’s success in abating power creep has not gone unnoticed by players of other TCGs, such as Yu-Gi-Oh (Williams, 2020).
One of the ways MtG handles power creep is through the concept of rotating formats (Stoddard, 2013a; Rosewater, 2005). These formats use only the latest cards, so older “mistakes” are no longer relevant. In addition, it may be easier to keep the power level relatively flat when focusing on a limited card pool instead of the over 20,000 cards (although these cards are still taken into account) (Stoddard, 2013a). However, that does not address non-rotating constructed formats, where “mistakes” still exist. Therefore, WotC has tried an “Escher Stairwell” approach, where some aspects are lowered in power and others raised (Blogatog, 2012, 2019; Rosewater, 2005).
Power creep is on the minds of players and designers alike, yet how “power” is defined is more nebulous (Brode, 2015; Stoddard, 2013b). Sam Stoddard, senior designer at WotC, makes it clear that power is relative and dependent on the environment; for example, power is different in limited versus constructed formats (Stoddard, 2013b). Further, Stoddard declares power creep to be relative due to the many formats of MtG. Additionally, he stipulates that two separate cards with the same mana cost, power, and toughness, but different abilities, may both be interpreted as “power creep” by players depending on the context, regardless of print order (Stoddard, 2013a). It is clear that power creep is a nuanced topic that is difficult to define (Figure 2).
Figure 2. A universal metric for the “power” of a card in Magic: the Gathering is difficult to define. Facets of a card’s power may be more easily tabulated. While not every card can be cleanly ordered by power, there do exist a few strictly better cards (sub 2a). Additionally, functional reprints are an easily identifiable facet of power (sub 2b). Defining the power of a card in a format-agnostic way is challenging, as the card pool for synergies and the number of copies one can run vary (sub 2c and 2d).
With no publicly disclosed metric of power creep, analysis of MtG’s longevity and of how it relates to power creep in the game is limited to players’ feelings about the strength and “health” of a format. With a historic number of bannings in 2020, is Mark Rosewater’s premonition of the game collapsing due to top-heaviness coming to pass (Rosewater, 2016)? To investigate the health of the game, we constructed a conservative metric for power creep. As alluded to by many prominent designers from WotC and even Hearthstone, power is relative; accordingly, our definition is based on a relative relation as well. This is achieved by framing power creep in relation to cards that are strictly better than one another.
The concept of associating a score with game pieces in order to rank them is not new (Chen et al., 2018; Fancher, 2015; Karsten, 2015; Zuin & Veloso, 2019; Zuin et al., 2020). While selecting cards by how often they win may identify powerful ones, it may be insufficient to address power creep in a game as a whole (Chen et al., 2018; Fancher, 2015; Karsten, 2015). For example, two popular MtG formats, “penny dreadful” and “pauper,” are non-rotating formats with restricted card pools (the latter officially supported) (Rasmussen, 2019). The cards therein are restricted via price and rarity (common), respectively. Cards with the highest win rate in such formats may be outright banned or “useless” in others. Thus, win ratio is insufficient to encapsulate power creep for a game as a whole. Zuin and colleagues take a fundamentally different approach by attempting to tabulate the resource cost for a card’s effect (Zuin & Veloso, 2019; Zuin et al., 2020). Conceptually, comparing the resource cost for an effect could allow one to check whether the cost has changed over time.
While inspiring, resource cost alone is unfortunately insufficient for determining power creep. Consider a “singleton” format (one copy per game piece allowed) versus a non-singleton format (e.g., at most four copies of a game piece per constructed deck). The power of a game piece depends on how consistently the player can access it in a given game, which in turn depends on the number of copies of the card in one’s deck. An example of such a card is “Relentless Rats,” which scales in power according to the number of other “Relentless Rats” in play. As an alternative example, the resource cost of all cards per set could be used to see if the average cost of cards is decreasing. However, looking at only the cost of a card fails to acknowledge cases where effects are getting cheaper and more of them are being used, while the average cost per expansion remains constant.
Additionally, one must still identify which cards are even contenders for being improvements over others. A card simply requiring fewer resources than another is insufficient for determining power creep. What makes the work of Zuin and colleagues of interest is that, since power creep is seemingly an interplay of resource cost and the effect for that cost, utilizing a Word2Vec embedding of the rules text to predict the cost opens the opportunity to use the rules text embedding to find which cards should be compared (e.g., via a distance metric to find similar effects) (Mikolov et al., 2013; Zuin & Veloso, 2019). Unfortunately, utilizing only distance in the embedding space may result in disagreement between the algorithm and players.
Since power has such a complex and contextual definition, can one define power creep for a game as a whole without defining power? Herein, we attempt to do that. We shall build up to such a definition by re-framing what power creep is. Let power creep be dependent on the re-visitation of design space at equal or lower resource cost. Then, to calculate power creep without calculating power, one must (1) assign a cost for the resource, (2) identify re-visitation of design space, that is, which cards to compare, and (3) assess change over time. The section Definitions, Terminology and Notation starts by formalizing the anatomy of a card’s face (Figure 3), from which those cards that are clear improvements on rate for resource can be defined as StrictlyBetter. With the cards that have improved on rate found, the flux of their release over time can finally be analyzed (PowerCreep).
Figure 3. An overview of the many attributes of a card’s face.
Definitions, Terminology and Notation
While care is taken to define the relevant attributes of MtG cards, the reader would benefit from some prior familiarity with the game and its rules. Additionally, it may benefit the reader to search for mentioned cards via WotC’s official card search engine Gatherer or the fan-favorite Scryfall (Scryfall, 2021; W. of the Coast, 2021a).
Card Pool
The pool of cards under consideration is the set of cards with at least one officially supported format legality and is represented as
Card
A card,
Figure 4. Examples of the multiple faces of a card.
Whereas the word face may refer to a side of a geometric rectangular cuboid, here it refers to a functional game object. This is partially motivated by the fact that printing on the four thinnest sides of a playing card is impractical. More generally, WotC has experimented with a single playing card having multiple faces on the front (e.g., “Smelt//Herd//Saw”). Therefore, it is conceptually possible to formulate a card with more than two faces.
Of note, all cards have minimal face cardinality of 2, where if WotC only defines
Further, in the case of “Smelt//Herd//Saw,” one can partition C into two subsets $C_f$ and $C_b$, where
With
Card Face
When a MtG player thinks of a prototypical card, perhaps “Llanowar Elves,” they are thinking of a card with two faces. Let C = Llanowar Elves, then
Pips
Pips are the symbols used to define a face’s casting cost (amongst other things like cost for ability activation, etc.). Specifically, the casting cost is the mana cost (defined shortly) required for casting the card defined on the card’s title bar. Mana symbols are a subset of pips that specifically relate to generic, colored, and colorless mana.
Briefly, generic mana is mana for which any colored or colorless mana can be used to satisfy the cost. Whether mana is colored or colorless is a property of the pip. When a colored pip is present, only the color(s) matching the pip’s designation will satisfy the accompanying cost. Likewise, colorless pips strictly require that the mana used to pay for those costs is equally devoid of color. For example, the mana cost 1 represents one generic mana, for which one green mana (G), one white mana (W), or one colorless mana (C) would satisfy the cost. A review of this terminology can be found under the section Mana Abbreviations (Rosewater, 2009); alternatively, a comprehensive table of these symbols can be found on Scryfall’s Color and Costs API documentation page (Scryfall, 2021). The reader is assumed to be familiar with the UTF-8 variant of pips in the accompanying examples.
Let
Mana Cost
Let the function pips return the set of pips required to cast the face i of a card C, then the mana cost of a face is the multiset of the pips required to cast the card
Converted Mana Cost
The converted mana cost (cmc) of a face is the total amount of mana (generic, colorless, and colored) required to cast the face
Returning to the example of a mana cost written as
Mana Efficient
Let the function colors return the subset of only mana symbols from a face’s pips
1. Two faces with equivalent mana cost are equally mana efficient in relation to one another, and
2. Two faces with equal cmc can be equally mana efficient if the former reduces colored mana symbols.
This definition may be contentious, as it means that the mana cost 4 is not more efficient than 3G; that is, a cost consisting of only generic mana, while loosely considered an improvement, is insufficient to qualify. Our rationale for this decision stems from our aim to provide a conservative definition, especially as a designer’s choice to require a color at all may bias the resource cost for an effect (Zuin & Veloso, 2019).
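The two notions above can be sketched in code. The following is a minimal Python illustration, not the authors' implementation: it assumes Scryfall-style cost strings such as `{3}{G}`, and it reads "at least as mana efficient" as requiring the same set of colors, no more pips of any color, and no higher cmc. The function names (`parse_cost`, `cmc`, `at_least_as_mana_efficient`) are hypothetical, and hybrid, Phyrexian, and colorless-specific pips are not modeled.

```python
import re
from collections import Counter

COLORS = frozenset("WUBRG")  # the five colored mana symbols


def parse_cost(cost: str) -> Counter:
    """Parse a Scryfall-style cost such as '{3}{G}' into a multiset of pips."""
    pips = Counter()
    for symbol in re.findall(r"\{([^}]+)\}", cost):
        if symbol.isdigit():
            pips["generic"] += int(symbol)  # '{3}' is three generic mana
        else:
            pips[symbol] += 1               # one pip per non-numeric symbol
    return pips


def cmc(cost: str) -> int:
    """Total mana required to cast; 'X' is treated as zero (an assumption)."""
    return sum(n for symbol, n in parse_cost(cost).items() if symbol != "X")


def at_least_as_mana_efficient(a: str, b: str) -> bool:
    """Assumed reading: same colors, no more colored pips, and no higher cmc."""
    pips_a, pips_b = parse_cost(a), parse_cost(b)
    colors_a = {s for s in pips_a if s in COLORS}
    colors_b = {s for s in pips_b if s in COLORS}
    return (
        colors_a == colors_b
        and cmc(a) <= cmc(b)
        and all(pips_a[c] <= pips_b[c] for c in colors_a)
    )
```

Under this reading, `{3}{G}` qualifies against `{2}{G}{G}` because it has the same cmc with fewer colored pips, whereas `{4}` does not qualify against `{3}{G}` because the color requirement was dropped entirely.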
Playability
The playability of a face, playable, is whether or not that face can be played (in case it is a land) or cast directly from a player’s hand. This is based on the comprehensive rules (W. of the Coast, 2021b).
For clarity outside of the comprehensive rules (W. of the Coast, 2021b), this stems first from the confusing similarity between the definitions of “playing” a card’s face and “casting” one. To cast a face of a card, a cost needs to be paid, even if that cost is zero. Some faces have zero as a mana cost, while others have no defined mana cost at all (most notably faces bearing the type Land). The “Agadeem, the Undercrypt” and “Adanto, the First Fort” example, however, requires a more complex look at the rules (W. of the Coast, 2021b). At the time of writing, a card might be playable or castable from hand provided that (1) the card has a mana cost or (2) the card is either a land on the front of a card or the card is a modal double-faced card (MDFC). While both “Agadeem, the Undercrypt” and “Adanto, the First Fort” are defined on $C_b$, the card “Legion’s Landing//Adanto, the First Fort” in its entirety is not an MDFC. Thus, “Adanto, the First Fort” is accessible only by first casting “Legion’s Landing” and then triggering the rules text to change the card’s face. The definition of playability may require adjustment in the future depending on changes to the rules.
In short, the example of “Agadeem, the Undercrypt” and “Adanto, the First Fort” serves to emphasize the requirement of constraining directly playable cards, for example,
Rules Text
The rules text of a face,
Two faces are rules equivalent when each corpus of rules text is a subset of another
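One way to picture the mutual-subset condition is with sets of clauses. The sketch below is only an illustration under stated assumptions: the rules text is split into clauses on periods and line breaks, clauses are lower-cased and whitespace-normalized, and self-references to the card's name are assumed to have been replaced beforehand. The helper names `clauses` and `rules_equivalent` are hypothetical, and the tokenization actually used for the corpus is not specified here.

```python
import re


def clauses(rules_text: str) -> frozenset:
    """Split rules text into a set of lower-cased, whitespace-normalized clauses."""
    parts = re.split(r"[.\n]+", rules_text)
    return frozenset(" ".join(p.lower().split()) for p in parts if p.strip())


def rules_equivalent(text_a: str, text_b: str) -> bool:
    """Each corpus must be a subset of the other, which amounts to set equality."""
    a, b = clauses(text_a), clauses(text_b)
    return a <= b and b <= a
```

With this representation the order of abilities does not matter, so "Flying. Vigilance." and "Vigilance" followed by "Flying" on a new line compare as equivalent, while a face with an extra ability does not.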
Supertypes, Types, and Subtypes
Each card face has at least one type. Additionally, they may have a supertype and/or a sub-type. Generally speaking, faces with “creature” amongst their types have at least one sub-type.
Let
Combat Stats
If
Then we can say a face
Rarity
At the time of writing there are four rarities: common, uncommon, rare, and mythic. Additionally, regardless of the number of faces a card has, all faces on a card share the same rarity.
Rarity, with the exclusion of the card printed for charity “Rarity” (which has no legal formats), impacts only two formats: draft and sealed. This impact is on the card’s frequency in the card pool, not on whether or not the card is in it at all. As we attempt to address power creep of the game as a whole, which includes many more constructed formats (vintage, legacy, modern, pioneer, historic, brawl, commander, etc.), whether or not a face has a different rarity than another face is of lesser concern.
However, there are formats, such as pauper, where the card pool is limited by rarity. Additionally, depending on the intent of the limited environment (draft or sealed) a reprinted card may undergo a rarity shift, that is, being reprinted with a different rarity than its initial printing (e.g., “Alabaster Mage” had the uncommon rarity in the expansion Magic 2012 but was reprinted with a common rarity in Double Masters). This furthers the notion that rarity is a construct for a subset of formats rather than of impact during a game where the card is legal. Lastly, MtG was designed so that rarity was not a correlate of power (Rosewater, 2005). Hence, for the purpose of this article rarity is excluded.
Release
There are many expansions to MtG. Let the initial release date of a card be represented as release′ (C). Then we can represent a card A being released after another card B as release′ (A) > release′ (B). As all faces of a card are released at the same time
Preexisting Pool
Let the preexisting card pool
Functional Reprints
So far, we have discussed many aspects of a card’s face; however, we have yet to discuss a face’s name. Normally, a format limits the number of copies of a card (and accordingly its faces) to four in a deck, or one if it is a “singleton” format. While there are cards with rules text that get around this limitation (e.g., “Persistent Petitioners”), there is another way around it: WotC has often printed “functional reprints” of previously existing cards. Here, we define a card’s face to be a functional reprint of another if:
1. Both are directly playable,
2. They have equivalent mana cost,
3. They have equivalent supertypes, types and subtypes,
4. They have equivalent rules text,
5. If both faces are creatures, they have equivalent combat stats, and
6. It has been released after the other.
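The six conditions translate fairly directly into a predicate. The following Python sketch is illustrative only: the `Face` record and its field names are hypothetical, rules text is assumed to be normalized already, and release is represented by position in release order rather than by actual dates. The example faces follow the equipment discussed later in the article, where "Veteran's Sidearm" differs from the others only in name and cost.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Face:
    name: str
    mana_cost: Tuple[str, ...]               # sorted pips, e.g. ("1",) or ("2", "G")
    types: frozenset                         # supertypes, types and subtypes together
    rules_text: str                          # assumed already normalized
    combat_stats: Optional[Tuple[int, int]]  # (power, toughness), None if not a creature
    release: int                             # position in release order
    playable: bool = True


def is_functional_reprint(new: Face, old: Face) -> bool:
    """Check the six listed conditions; names are deliberately not compared."""
    both_creatures = "Creature" in new.types and "Creature" in old.types
    return (
        new.playable and old.playable                                      # 1
        and new.mana_cost == old.mana_cost                                 # 2
        and new.types == old.types                                         # 3
        and new.rules_text == old.rules_text                               # 4
        and (not both_creatures or new.combat_stats == old.combat_stats)   # 5
        and new.release > old.release                                      # 6
    )


EQUIPMENT = frozenset({"Artifact", "Equipment"})
TEXT = "equipped creature gets +1/+1. equip {1}"
scimitar = Face("Leonin Scimitar", ("1",), EQUIPMENT, TEXT, None, release=1)
sidearm = Face("Veteran's Sidearm", ("2",), EQUIPMENT, TEXT, None, release=2)
khopesh = Face("Honed Khopesh", ("1",), EQUIPMENT, TEXT, None, release=3)
```

Here `is_functional_reprint(khopesh, scimitar)` holds, while the reverse direction fails on release order and the sidearm fails on mana cost.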
Thus
Strict Comparisons
Strictly Better and Strictly Worse
Here we will attempt to codify a sufficient, albeit not all encompassing, definition of “strictly better.” Normally, when a player says a card is “strictly better” they are roughly saying that card A has the same effects as card B for a reduced cmc (resource cost), or A has the same effects as B at the same cmc with “upside.” There is a lot of wiggle room within the word “upside.”
Therefore, we will use a far more conservative definition. Given face i of card A and face j of card B, 1. Both faces are playable, 2. 3. Both faces have equivalent rules text, 4. Both faces share all supertypes, types and subtypes, 5. If both faces are creatures then 6.
Then we can define the function StrictlyBetter as
Strictly Better and Strictly Worse at Release
While the above definition of strictly better clearly captures a card face that provides equal or better mana cost for equal or better combat stats and equivalent rules text (without being a functional reprint), it falls short; power creep occurs over time and it is therefore relevant to also define strictly better at time of release in order to prevent counting a
We can define the function
Sets of Faces
With these definitions of strictly better at release, strictly worse at release and functional reprints, we can define the sets of card faces we will use to define power creep.
We can define the set
Similarly, let the set
Power Creep
Before defining power creep, it may be worthwhile to state some tenets that such a definition should reflect:
1. Any card, or face thereof, from the first release of the game should not count toward power creep. There are cards in that set that players would consider power creep if released today. Yet, as they are the foundation of the game, their mere existence cannot have crept the game’s power.
2. Any card, or face thereof, should not be counted twice. A direct reprint of a preexisting card may affect the strength of limited but not the many “eternal” (non-rotating) formats.
3. Any card, or face thereof, should not be scaled by monetary price. While price is perhaps a correlate of power, external factors may influence it.
Base Model
With a notion of a strictly better card face at release than another, we can conservatively define power creep as the number of strictly better card faces printed in a given duration
Functional Reprints
While these definitions provided a solid base model for power creep, as noted by WotC’s designers, functional reprints are a form of power creep (Stoddard, 2013b). Thus, an extension to this model can be defined as
Escher Stairwell
The role of strictly worse at release card faces in power creep is unclear. WotC may be implying that these faces are part of the Escher Stairwell approach to help manage power; however, players might argue that they do not, as players would not play a strictly worse card if the alternative is available. Should the power creep of the game be allowed to arrest, it is doubtful that the power of the game as a whole would ever recede. Yet, the goal here is to provide a conservative metric to see the base rate of power creep. Thus, we define the base Escher Stairwell Power Creep model as
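Additionally, as MtG has increased the number of new card faces released each year, let the normalized Escher Stairwell Power Creep model be defined as the difference between the strictly better and strictly worse card faces at release (with or without functional reprints) divided by the number of new faces released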
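Reading the normalized model as it is described for Figures 8 and 9 (strictly better faces at release minus strictly worse faces at release, over the number of new faces released), the arithmetic can be sketched as follows. The function names are hypothetical and the counts in the usage note are made up for illustration; they are not results from this study.

```python
def escher_stairwell(better: int, worse: int) -> int:
    """Net count: strictly better minus strictly worse faces in a period."""
    return better - worse


def normalized_escher_stairwell(better: int, worse: int, new_faces: int) -> float:
    """Net count divided by the number of new faces released in the same period."""
    if new_faces <= 0:
        raise ValueError("new_faces must be positive")
    return escher_stairwell(better, worse) / new_faces
```

For a hypothetical year with 5 strictly better faces, 3 strictly worse faces, and 1,000 new faces, the normalized value is (5 - 3) / 1000. A year with more strictly worse than strictly better faces yields a negative value, which is what allows this variant to dip below zero.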
Results
The first few years of the game saw no strictly better card faces released (Figure 5). From 1997 to ≈2014, strictly better faces were released sporadically. However, starting in 2015 we see a marked shift in the number being released. This stands in stark contrast to the functional reprint policy, which spiked much earlier on. As for why this is the case, one might hypothesize the creation of a new format such as modern (2011), or WotC’s attention being drawn to what is now the most popular format, commander, with the release of pre-constructed commander decks (2011). Since WotC designers generally work two to three years in advance, the (official) adoption of these two formats may have motivated the creation of more powerful cards later. Hypotheses aside, for one of the longest standing card games, the cumulative number of strictly better card faces over its 25 years seems remarkably low (Figure 6). The strict version of our conservative metric for solely strictly better cards does not even pass one hundred card faces. A relaxing of the metric sees an influx; however, even including reprints only nears 500. With more than 20,000 unique cards (even more card faces), this is less than 2% of the card pool. Including functional reprints pushes this over the 2% mark.
Figure 5. The number of strictly better card faces released each year. The earliest strictly better face was released in 1997; however, the expansion block of Portal, Portal Second Age, and Portal Three Kingdoms saw an influx of functional reprints.
Figure 6. The cumulative release of strictly better card faces. The influx of functional reprints from the Portal block expansion sets is clear in subfigure 6c.

While the cumulative rate of power creep is not monotonic, it has a rising trajectory in the case of our base model. This is not surprising: to stabilize the rate, fewer than one new strictly better card face could be released per year, and to reverse it, none could be released for several years. Looking at the expanded model with functional reprints included, the cumulative rate of power creep takes a substantial dip in the early 2000s; yet this rate is now recovering. Interestingly, although players point toward a new design philosophy starting in 2019 named “F.I.R.E” as the source of the mass bannings in 2020, the increase in the rate of power creep predates F.I.R.E. design (see Figure 7) (Stoddard, 2019).
Figure 7. The cumulative rate of power creep, that is, the total number of strictly better faces released divided by the duration of the game’s existence, with (subfigures 7c, 7d) and without (subfigures 7a, 7b) functional reprints. While functional reprints substantially spike the cumulative rate, this rate nearly recovers by 2020.
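The cumulative rate described for Figure 7 (total strictly better faces released so far, divided by the duration of the game's existence) can be computed in a few lines. This is a minimal sketch: the function name is hypothetical, the duration is assumed to count the first year as year one, and the yearly counts in the usage note are invented for illustration rather than taken from the results.

```python
from itertools import accumulate


def cumulative_rate(counts_by_year: dict, first_year: int) -> dict:
    """Map each year to (faces released up to that year) / (years of existence)."""
    years = sorted(counts_by_year)
    running_totals = accumulate(counts_by_year[y] for y in years)
    # Assumption: the release year itself counts as one full year of existence.
    return {y: total / (y - first_year + 1) for y, total in zip(years, running_totals)}
```

For example, with hypothetical counts `{1993: 0, 1994: 0, 1995: 0, 1996: 0, 1997: 2}` and `first_year=1993`, the rate is zero through 1996 and 2 / 5 in 1997. Because the denominator grows every year, the rate falls in any year where too few new faces are added, which is why the cumulative rate need not be monotonic.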
With a year span of one, all models except the strict, reprint-inclusive normalized Escher Stairwell Power Creep model show an upward trajectory for the number of power crept faces released each year (Figure 8). This exception reflects the functional reprint influx from the Portal expansion block. Additionally, these models show that there are years where the number of strictly worse card faces at release, and functional reprints thereof, is greater than the number of strictly better card faces. However, even under the assumption that worse card faces can weaken the game, the cumulative normalized Escher Stairwell Power Creep model shows that power creep is nonetheless on an upward trajectory. Of note, this trajectory does not exist solely due to more cards being released each year (Figure 9). For a breakdown of power creep by type, see Table 1.
Figure 8. The normalized Escher Stairwell Power Creep model. The difference between strictly better card faces at release and strictly worse card faces, with/without functional reprints, over the number of new faces released.
Figure 9. The cumulative normalized Escher Stairwell Power Creep model. The difference between strictly better card faces at release and strictly worse card faces, with/without functional reprints, over the number of new faces released.
Table 1. Breakdown of the types (supertype and type) of the strictly better card faces for the strict and relaxed definition. Relaxing the sub-type restriction yields an influx of creatures.

Discussion
Curious Examples of Power Creep
To highlight how exactly these definitions work in practice we find it useful to walk through two examples.
For the first example, consider the faces, in release order, “Leonin Scimitar,” “Veteran’s Sidearm,” “Honed Khopesh,” and “Short Sword.” The only differences between these cards are:
1. “Veteran’s Sidearm” has a cmc of two, whereas all others have a cmc of one, and
2. The name of each card.
“Leonin Scimitar” is not strictly better at release than “Veteran’s Sidearm” as it was printed prior to it; consequently, “Veteran’s Sidearm” is strictly worse at release than the scimitar. “Honed Khopesh” and the “Short Sword” are functional reprints of the scimitar and are therefore part of
For the second example consider the faces, in release order, “Regrowth,” “Elven Cache,” “Recollect,” and “Bala Ged Recovery.”
“Elven Cache” is strictly worse at release than “Regrowth.” While “Recollect” is strictly better than “Elven Cache,” it is not strictly better at release, as it is strictly worse than “Regrowth,” which was printed prior. Thus, “Recollect” is strictly worse at release. “Bala Ged Recovery” is a functional reprint of “Recollect,” as this metric operates on the face level. Of note, “Recollect” has face Ø for its second face, whereas “Bala Ged Recovery” has the second face “Bala Ged Sanctuary.” Players may say that this makes the card strictly better. However, as discussed below with companions, wordiness, etc., it is not unreasonable that cards without the face Ø may have an intrinsic downside, for example, if WotC prints a card that prohibits opponents from casting them.
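The second example can be traced in code. The sketch below is a simplified reading, not the article's formal definition: costs use a shorthand such as `2GG` (single-digit generic mana only), all four faces are assumed to share the same type and normalized rules text, combat stats are omitted because none of the faces are creatures, and the classification order (functional reprint, then strictly worse, then strictly better) is an assumption chosen to match the walk-through above. All names in the code are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Face:
    name: str
    cost: str    # shorthand such as "2GG"; single-digit generic mana only
    text: str    # rules text, assumed already normalized
    order: int   # position in release order (1 = earliest)


def pips(cost: str) -> Counter:
    """Multiset of pips: digits add generic mana, letters add one colored pip."""
    counted = Counter()
    for ch in cost:
        if ch.isdigit():
            counted["generic"] += int(ch)
        else:
            counted[ch] += 1
    return counted


def cost_no_worse(a: Face, b: Face) -> bool:
    """Same colors, no more pips of any color, and no higher total cost."""
    pa, pb = pips(a.cost), pips(b.cost)
    colors_a, colors_b = set(pa) - {"generic"}, set(pb) - {"generic"}
    return (colors_a == colors_b
            and all(pa[c] <= pb[c] for c in colors_a)
            and sum(pa.values()) <= sum(pb.values()))


def strictly_better(a: Face, b: Face) -> bool:
    """Same rules text at a cost that is no worse and not identical."""
    return a.text == b.text and cost_no_worse(a, b) and pips(a.cost) != pips(b.cost)


def classify_at_release(face: Face, pool: list) -> str:
    earlier = [f for f in pool if f.order < face.order]
    if any(f.text == face.text and pips(f.cost) == pips(face.cost) for f in earlier):
        return "functional reprint"
    if any(strictly_better(f, face) for f in earlier):
        return "strictly worse at release"
    if any(strictly_better(face, f) for f in earlier):
        return "strictly better at release"
    return "neither"


TEXT = "return target card from your graveyard to your hand"
POOL = [
    Face("Regrowth", "1G", TEXT, 1),
    Face("Elven Cache", "2GG", TEXT, 2),
    Face("Recollect", "2G", TEXT, 3),
    Face("Bala Ged Recovery", "2G", TEXT, 4),
]
RESULT = {f.name: classify_at_release(f, POOL) for f in POOL}
```

Under these assumptions, `RESULT` labels "Elven Cache" and "Recollect" as strictly worse at release, "Bala Ged Recovery" as a functional reprint, and "Regrowth" as neither, since nothing precedes it in this small pool.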
StrictlyBetter as a metric
With any metric there are strengths and weaknesses; our definition is no exception. Foremost, as it is largely dependent on the conservative metric of StrictlyBetter, this value could be manipulated by WotC moving forward. Knowing that their game could be evaluated in such a way may intentionally or unintentionally bias the company or its employees to design cards with slight tweaks to the rules text to reduce the number of strictly better card faces. Additionally, as mentioned in the generally conceived notion of “strictly better,” this metric does not attempt to address card faces with equivalent cmc and “upside.” Further, if a card face is strictly better than several others, it is only counted once.
In this metric’s favor, as “upside” is not considered, the metric can be calculated exhaustively without human annotation. To its benefit, our definition of strictly better does not punish WotC’s innovation; for example, when a new expansion adds cards to the card pool that have new keywords or rules text, those card faces cannot be strictly better. Further, the expanded model, which includes functional reprints, adheres to WotC’s own interpretation of power creep and shows substantial mitigation during the early 2000s (Stoddard, 2013b).
To continue questioning our notion of strictly better, WotC has shown repeated interest in card faces with rules text that addresses the cmc of a card, for example, “Void Winnower.” The existence of such cards required the addition of “given no continuous effects or triggers” to the function StrictlyBetter, as otherwise an even cmc face that is strictly better than an odd cmc face would, in some instances, be unplayable. This is even more evident with the more recently printed companion cards.
Companion is a keyword in the rules text that can affect deck construction prior to the game actually being played. Since “Gyruda, Doom of Depths,” “Obosh, the Preypiercer,” and “Lurrus of the Dream-Den” are concerned with cmc, one could argue that any change in cmc is actually
Along this line of reasoning, one could postulate that any change in mana cost would be potentially detrimental due to another companion “Jegantha, the Wellspring” which cares specifically about the type of pips in the mana cost. That is, despite mechanics, like devotion, which care about types of pips in a face’s mana cost, our definition generally overlooks this through the given statement in the StrictlyBetter definition (namely, the function manaefficient). Fortunately, there is not yet a card face with companion that is concerned with power and toughness.
Thus, a possibly more sound definition would consider only card faces which are equivalent in all ways except for a) combat worthiness and b) having rules text specifically with “upside.” However, WotC has shown putative interest in designing faces with effects that concern the amount of rules text as well, for example, “Alexander Clamilton.” While players may dismiss this as a “joke” card, the companion “Zirda, the Dawnwaker” concerns itself with other cards having activated abilities (a form of “wordiness”). Therefore, the notion of a “strictly better” card face, which is true 100% of the time regardless of context, is moot. Tangentially, while this metric is not directly usable in other TCGs, its core tenets might be, that is, finding card faces which are both directly playable and otherwise identical except for combat worthiness and cost to cast/play the card.
Lastly, there are instances where WotC prints two faces on separate cards during different years, each of which is strictly better than a third face. In other words, two (or more) equivalent designs with different names are better than another older design, but the later of the two card faces is a functional reprint of the other(s). The expanded model will count each unique instance, while the base model does not, although such occurrences are rare. This may be a point of contention with players’ perspective of what power creep is. Nevertheless, we feel confident that the conservative metric of strictly better, as well as the base and expanded models of power creep introduced here, are sufficiently stringent to be of use.
PowerCreep as a metric
Given the definition provided here, one could easily ask the converse, that is, “What about faces which are strictly worse?” While a fascinating question, adding “strictly worse” cards to the card pool does not weaken the game. Outside of the two limited formats, players are not obligated to play with these cards. Given the choice between two card faces, one being strictly worse than the other, for competitive play the choice is trivial. In other words, introducing weaker faces does not mitigate the existence of stronger faces in the game. Correspondingly, it does not contribute to the definition of PowerCreep. Nonetheless, as we attempt to provide a conservative metric to find the base level of power creep in the game, we included it in the Escher Stairwell model (Figures 8 and 9). While greatly reduced, in three of the four variants the yearly power creep maintains a positive slope (Figure 8), and it always has a positive slope in the cumulative variants (Figure 9).
On the other hand, many of the card faces identified here as strictly better do not see much, if any, play. Therefore, although these cards are strict evidence of power creep, players may attest that these are not the cards which will cause the game to collapse inward on itself (Rosewater, 2016).
In the introduction, we allude to resource cost as relevant to the definition but not the sole determinant of power creep. This raises the question of whether this formulation (power creep as the re-visitation of design space at equal or lower resource cost) generalizes to media outside of TCGs and CCGs. For example, consider a first-person shooter (FPS) video game where a gun, in an update, has its reload rate lowered. The “resource cost” of the gun might be defined by any number of additional variables (paid downloadable content, time to unlock via a quest or mission, weight to equip the item, etc.) or the item might have been free and accessible since the game began. Surely, a tweak to the item’s properties qualifies as re-visitation of design space. While the definition provided here is tailored for MtG, an adjusted version may be applicable elsewhere.
Power Creep as Defined versus Public Perception
As our definition of power creep utilizes a conservative definition of card faces which are strictly better (StrictlyBetter), PowerCreep generally addresses card faces which are not the source of scrutiny. The first face of the 2018 card “Wishcoin Crab” being strictly better than the first face of the 2011 “Armored Cancrix” is not what pops into players' heads when they accuse the game of power creep. Rather, they think of cards like “Oko, Thief of Crowns” and “Omnath, Locus of Creation,” cards they believed were printed to increase sales (Chen et al., 2018; Perreault et al., 2021; Zuin & Veloso, 2019). Consequently, given how restrictive our metric is, PowerCreep is the floor. Further, it is generally accepted that power creep will exist in any long-running game; it is the rate of power creep that concerns players. Finding an increasing rate, with a notable spike around 2014–2015, shows a putative shift in design prior to WotC announcing F.I.R.E design in 2019 (Stoddard, 2019).
Future Directions
Here, we presented three core models: the base model, which solely concerns itself with card faces that are strictly better at release; the functional reprint inclusive model, which factors in non-detrimental functional reprints; and the Escher Stairwell model, which considers the quantity of strictly better cards at release versus those that are strictly worse. Each of these models may relax the sub-type restriction as well as normalize the rate to the newly released card faces each year. Currently, they conservatively capture the existence of power creep within the game, although it is marginal. Yet, there are myriad other factors to incorporate into these models to more accurately represent power creep. These include the rate of card faces banned and the number of formats in which the bans occur, card faces with “upside,” card faces with “downside,” format-specific terms, etc.
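The normalization mentioned above can be sketched as follows, assuming it amounts to dividing each year's count of strictly better faces by the number of newly released faces that year. The function name `normalized_rate` and all counts are hypothetical placeholders, not values from this study.

```python
def normalized_rate(strictly_better_by_year, new_faces_by_year):
    """Fraction of each year's newly released faces that are strictly better."""
    rates = {}
    for year, new_faces in new_faces_by_year.items():
        if new_faces > 0:  # skip years with no new faces to avoid dividing by zero
            rates[year] = strictly_better_by_year.get(year, 0) / new_faces
    return rates


# Placeholder counts for illustration only.
new_faces = {2018: 800, 2019: 1000}
strictly_better_counts = {2018: 4, 2019: 10}

rates = normalized_rate(strictly_better_counts, new_faces)
for year in sorted(rates):
    print(year, f"{rates[year]:.2%}")
```

Normalizing in this way separates growth in the raw count that merely tracks larger yearly releases from growth in the share of releases that are strictly better.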
Summary and Conclusions
Here, we introduce a conservative metric for defining strictly better faces of cards, as well as a base and an expanded model of power creep in the game MtG. Although the metric does not touch the cards which players most likely imagine it would, it does capture a swath of cards reasonably distributed across types. With the most relaxed definition suggesting that less than 2.5% of cards printed are strictly better or functional reprints thereof, this metric seemingly confirms that WotC has successfully limited the occurrence of power creep, thus preventing the game from “failing” due to it. Conversely, it also shows a stark increase in the printings of strictly better cards and an increased rate of power creep that may relate to the unprecedented number of bannings in 2020 (partially due to more cards being released per year). Our work provides a solid foundation for (1) more encompassing definitions of power creep and (2) analysis of power creep in relation to other pertinent topics (e.g., correlating with sales data, number of bannings, etc.). We believe these metrics are a good step toward codifying power creep.
If we could press upon the reader three points to take away from this paper, they are as follows: 1. A perspective shift from viewing power creep as “greater power” in the game toward viewing it as the re-visitation of design space at equal or lower resource cost may be beneficial in conceptualizing what constitutes power. 2. A sensible definition of power creep is achievable without defining power, by placing extreme constraints on qualifying game piece candidates. 3. Although many of the power crept game pieces identified here will likely be scoffed at by the players of the game due to “lack of power,” an automated search for such pieces at the perceived lowest levels of power may further claims of power creep being pervasive. Power creeping the worst pieces in the game strongly suggests that doing so is necessary due to an overall increase of power in the game. Even when factoring in intentional decreases of power, the number of strictly better card faces released results in a positive rate of power creep.
Acknowledgments
Sincere thanks are extended towards Meghan Burden and Michael Rappe for the engaging and thoughtful topical discussions.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
