CoolStuffInc.com

Pro Tour Data Dive

Today, I would like to take some time to talk about something I am quite fond of -- data and Magic. Specifically, I am going to be focusing on the data that came out of Pro Tour Rivals of Ixalan. For those who are unfamiliar with my non-Magic background, I have a master's degree in mathematics and have spent time teaching statistics classes and working with large data sets in the real world.

Before I dive into this data, though, I would like to talk about why the mostly complete data set we have from PTRIX is so valuable. One of the most important things when sorting through data is to be aware of the biases in it and how they will skew what you see.

The largest bias in most Magic data is simply that it is incomplete. The "metagames" that you see created on websites like MTGGoldfish are sculpted using the decklists posted from Magic Online in addition to top finishes from decks played at paper events. While this makes it fun to look at to find sweet decklists, it is difficult to use this incomplete data to help us figure out which decklists are actually performing well overall.

I know this statement may seem confusing. If a deck Top 8'd a major event, doesn't that mean it performed well at that event? The answer is that it's a bit more complicated than that. To illustrate my point, let's make up some numbers to use as an example. Let's say we somehow had all of the roughly 1,000 decklists from a Grand Prix or Open and the starting metagame breakdown of the event looked something like this:

  • Deck A: 25%
  • Deck B: 15%
  • Deck C: 10%
  • Deck D: 10%
  • Other Decks: 40%

Skip to the end of this event and we are looking at the Top 64 decklists that are posted. They are broken down in the following manner:

  • Deck A: 16 copies (25%)
  • Deck D: 12 copies (~19%)
  • Deck B: 10 copies (~16%)
  • Deck C: 10 copies (~16%)
  • Other Decks: 16 copies (25%)

Now, just looking at the end results, without knowing the starting distribution of decklists, we would draw the following conclusions:

  • Deck A seems to be the deck to beat. It was 25% of the top finishes
  • Deck D is the second best choice
  • Decks B & C are similar in power level, but worse than A & D
  • There are a lot of reasonable rogue decks in the format and many of them did well

When we know the starting distribution of the data though, these top finishes paint a different story. With a complete picture, I am able to draw the following better conclusions:

  • Deck D was the best performing deck this weekend. Even though it had fewer top finishes than Deck A, its percentage of top finishes is almost twice its share of the starting field
  • Deck C did not have as good of a weekend as Deck D, but it did overperform
  • Decks A and B are certainly competitive, but not impressive. Their starting percentages and their shares of the top finishes are essentially the same
  • Many of the rogue decks in the event fell short. Even though they were 25% of the top finishes, they were a much larger percentage of the starting field.
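
For anyone who wants to check the arithmetic, here is a minimal Python sketch of that comparison. It uses only the made-up numbers from the example above (no real tournament data), and the names `starting_share`, `top64_copies`, and `conversion_ratio` are my own labels for illustration.

```python
# Hypothetical numbers from the made-up event above (not real tournament data).
starting_share = {
    "Deck A": 0.25,
    "Deck B": 0.15,
    "Deck C": 0.10,
    "Deck D": 0.10,
    "Other Decks": 0.40,
}
top64_copies = {
    "Deck A": 16,
    "Deck D": 12,
    "Deck B": 10,
    "Deck C": 10,
    "Other Decks": 16,
}

total_top = sum(top64_copies.values())  # 64 posted decklists


def conversion_ratio(deck):
    """Share of the top finishes divided by share of the starting field."""
    top_share = top64_copies[deck] / total_top
    return top_share / starting_share[deck]


ratios = {deck: conversion_ratio(deck) for deck in starting_share}

# Print from best to worst; above 1.0 means the deck overperformed.
for deck, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{deck}: {ratio:.3f}")
```

Sorted this way, Deck D lands on top and the rogue decks land on the bottom, which is the opposite of what the raw Top 64 counts suggest.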

Do you see how the conclusions we draw from the data change when we have all the information? Hopefully this condensed example helps you understand how I am going to draw conclusions from the Modern data we have from Pro Tour Rivals of Ixalan next.

Pro Tour Rivals of Ixalan Data

For those who want to see the raw data before reading my opinions, I have a Google Spreadsheet you can view here with the raw numbers for everything I am about to discuss.

One of the reasons data from the Pro Tour is so awesome is that this is the one event where Wizards of the Coast chooses to give us all of the information for Constructed. Specifically, we get to know what archetype every one of the 464 competitors chose for this event to create tables like this:

Let's explain what this table means. The first column is simply archetype names. The second column is the percentage each archetype occupied in the 142 posted decks that went 6-4 or better in the Modern portion of the Pro Tour Swiss rounds. So, for instance, Madcap Moon was 4 of the 142 top decks, or 2.8% of them. The third column is the percentage each archetype had at the start of the event. To continue using Madcap as an example, it was 5 of the 464 total decks registered in this event, or about 1.1%.

Finally, the "Performance Rating" is a metric that I created to compare an archetype's 6-4 or better finishes with its starting portion of the field. Specifically it is:

Performance Rating = (6-4 or better percentage) / (Day 1 percentage)

While I could have chosen a stricter cutoff for the Performance Rating, simply taking the decks that won over 50% of their matches felt reasonable. The Pro Tour tends to be a tough field, so winning more than you lose is generally fairly respectable.

This means any deck with a Performance Rating over 1.0 overperformed by some amount and any deck with a Performance Rating under 1.0 had a poor weekend overall. Using this information we can paint some broad strokes about the results of this Pro Tour.
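
As a quick worked example, here is a small Python sketch of the Performance Rating using the Madcap Moon counts quoted above (4 of the 142 decks at 6-4 or better, 5 of the 464 registered decks). The function name `performance_rating` and its parameters are my own naming for illustration, and numbers in the table may be rounded slightly differently.

```python
def performance_rating(top_count, top_total, field_count, field_total):
    """6-4 or better percentage divided by Day 1 percentage."""
    top_share = top_count / top_total        # share of the 6-4 or better decks
    field_share = field_count / field_total  # share of the starting field
    return top_share / field_share


# Madcap Moon: 4 of the 142 top decks, 5 of the 464 registered decks.
madcap_moon = performance_rating(4, 142, 5, 464)
print(f"Madcap Moon: {madcap_moon:.2f}")  # roughly 2.61
```

A rating of roughly 2.6 means Madcap Moon made up about two and a half times as much of the 6-4 or better decks as it did of the starting field.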

First is that every deck with a Performance Rating above 1.0 fell into one of the following four categories:

While this still leaves us with a variety of decks to choose from that performed well, it really helps us condense what style of decks are well positioned in Modern currently.

Looking in the opposite direction, the archetypes with low Performance Ratings were:

  • Blue-Based Control Decks
  • Linear Combo Decks
  • Rogue Decks

Many people, myself included, have often accused Modern of having too much linear combo. This Pro Tour showed that while these decks do see play, they are not all that powerful. This data does, however, seem to validate the feeling that Blue-based control still struggles in this format.

The thing that I found most interesting was the poor performance of the "rogue" decks at this event. Modern is often touted as a "format where you can play anything", but a majority of the people doing just that at this Pro Tour fell short. While a few broke through and had decent finishes, most did not.

Finally, let's talk about a few more specific things that this data implies.

First is that the Madcap Experiment version of "Blue Moon" far outperformed the Through the Breach variations. In fact, there were five people who played Breach at this event and not a single one of them finished 6-4 or better. On the other hand, four of the five people who registered Madcap Moon finished 6-4 or better, giving this archetype the best Performance Rating.

Second is that Grixis was far overshadowed by the Traverse variations as the best performing Death's Shadow build. While Grixis did not have a bad weekend, its numbers were fairly average, while Traverse Shadow had the third highest Performance Rating overall. If you are looking to play giant one-mana creatures, playing Traverse the Ulvenwald looks like the way to go.

In addition to winning the Pro Tour, Lantern Control had a pretty good weekend overall, and I would expect an uptick in this deck for both of these reasons. Humans had the largest percentage of the starting field and still managed to convert into a larger percentage of the top finishes. This deck is very good and will definitely continue seeing play.

Finally, non-combo/non-prison control decks like WU and Jeskai Control had big names and teams playing them and still came up pretty short. Yes, a couple of players finished with good records with these archetypes, but by and large their total numbers against the field were very poor.

Wrapping Up

While it is certainly true that having a full data set like we do for Pro Tour Rivals allows our data to be less biased than just looking at top finishes, it is important to be aware that what we are looking at from this Pro Tour is not flawless on its own. First, we need to remember that the Pro Tour is a split format event. This means there could have been players who would have finished 6-4 or better in Modern but did not get to play day two because of a poor record in draft.

Second is that, for many of the decks here, especially those with the highest Performance Ratings, the sample size we are working with is fairly small. Larger samples give more reliable estimates, which in turn let us draw more reliable conclusions.
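
To put a rough number on how little a small sample pins down, here is a Python sketch that computes a Wilson score interval, a standard way to put error bars on a proportion. This is my own illustration and not something used for the numbers above. The 4-of-5 figure is Madcap Moon's count from this event, while the 80-of-100 line is a purely hypothetical larger sample with the same rate, included only for comparison.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z=1.96)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Madcap Moon: 4 of the 5 pilots finished 6-4 or better.
small_low, small_high = wilson_interval(4, 5)

# Hypothetical comparison: the same 80% rate over 100 pilots (made-up sample).
large_low, large_high = wilson_interval(80, 100)

print(f"4 of 5:    {small_low:.2f} to {small_high:.2f}")
print(f"80 of 100: {large_low:.2f} to {large_high:.2f}")
```

The interval for five pilots is far wider than the one for the hypothetical hundred, which is the practical meaning of "small sample size" here: four out of five is encouraging, but it is consistent with a wide range of true conversion rates.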

At any rate, I hope you found this piece enlightening, or at the very least amusing. Hopefully it helped you get a better understanding of not only why data in Magic is hard to view objectively, but also where Modern stands after this most recent Pro Tour. Have a question about something I did not go into enough detail on above? Let me know in a comment below!

Finally, remember: correlation doesn't imply causation.

Cheers,

--Jeff Hoogland

Special thanks to Casper Schaefer for helping me compile the 6-4 or better data referenced here.

