Testing Stoneforge Mystic in Modern: Part One

The banned list is one of the hot Modern topics whenever a new set is released. Everyone is speculating about what, if anything, will get the ax or be unleashed upon the world. Speculation this time is focused on Infect and/or Dredge taking a hit and Bloodbraid Elf coming off the list. I’m not here to add to the speculation but instead to provide hard data on whether an unrelated card should come off.

I have been hinting at (and making excuses for) this article for weeks now. The time has finally come for me to publish my findings. Today I begin presenting the results of my investigation into the viability of unbanning Stoneforge Mystic. It will be quite long, so today I will present the setup and methodology, and next week I will present the actual data.

The Prelude

Longtime readers may remember that last December Sheridan tested Stoneforge Mystic in an Abzan list against Affinity. What he found was that the option for a turn-three Batterskull did not significantly impact the matchup in game one, and that sideboard cards played a much larger role in giving Abzan a 50% win rate against Affinity. For reference, here’s the deck Sheridan used:

Stoneforge Abzan, by Sheridan Lardner (Original Test Deck)

Creatures (14)
Stoneforge Mystic
Tarmogoyf
Scavenging Ooze
Siege Rhino
Tasigur, the Golden Fang

Artifacts (2)
Batterskull
Sword of Feast and Famine

Instants (7)
Abrupt Decay
Path to Exile

Planeswalkers (3)
Liliana of the Veil

Sorceries (10)
Inquisition of Kozilek
Thoughtseize
Lingering Souls
Maelstrom Pulse

Lands (24)
Verdant Catacombs
Marsh Flats
Windswept Heath
Stirring Wildwood
Shambling Vent
Twilight Mire
Forest
Plains
Swamp
Overgrown Tomb
Temple Garden
Godless Shrine
Gavony Township
Ghost Quarter

Sideboard (15)
Sword of Fire and Ice
Stony Silence
Maelstrom Pulse
Scavenging Ooze
Slaughter Pact
Engineered Explosives
Fulminator Mage
Nihil Spellbomb
Duress
Liliana of the Veil

I don’t doubt his results are accurate, but I don’t think they really tell the story. Affinity has plenty of ways to get around Batterskull, so I never expected Stoneforge to have much effect there. Affinity is an “unfair” deck (I really need to come up with a better term for that kind of deck) and can ignore most of what Abzan is doing. What I was always interested in was the effect Stoneforge would have on fair decks, and Sheridan never got a chance to test those.

Expanded Scope

Additionally, Sheridan mentioned that he wanted to do more testing with other decks, so I started gathering data for him. Specifically, I started testing a TwinBlade deck, which was Jeskai Twin with Stoneforge Mystic and a pair of Batterskulls. I was mostly done with data collection when Splinter Twin got banned, rendering it all moot.

What I can say about TwinBlade is that it was a nightmare to play against. I tested Burn and was working on Jund, and Stoneforge had a noticeable, trending toward significant, impact on both matchups. Burn traditionally had trouble against Twin because it couldn’t win quickly enough to beat the combo when Twin had some interaction, while the consensus on Twin vs. Jund was that it was 50/50.

The addition of Mystic definitively pushed Twin over Burn. Repeatable lifegain is unsurprisingly hard for Burn to beat, and trying to do so left them open to being comboed out. Jund was also losing ground, though I was never certain if that was due to Mystic herself or if we were just playing the matchups poorly. Trying to defend against the combo and Batterskull spread Jund pretty thin, but that might have been player error.

In any case, the threat of that deck was going to lead me to recommend that Mystic never be unbanned. With Twin gone, I thought it worth looking into again.

Establish Procedures

Having decided to test out Stoneforge, and that I wanted to provide a definitive answer about its impact, I knew that meant I had to test a lot of decks. The problem was that there isn’t as strong a consensus about Abzan’s matchups other than Affinity. I decided to establish a baseline myself. This would involve playing a stock Abzan list against a test gauntlet and then running the gauntlet again with the Stoneforge list. After some scouring, this is what I came up with:

Stock Abzan, by David Ernenwein (Test Deck)

Creatures (13)
Tarmogoyf
Siege Rhino
Scavenging Ooze
Tasigur, the Golden Fang

Instants (7)
Path to Exile
Abrupt Decay

Planeswalkers (3)
Liliana of the Veil

Sorceries (13)
Inquisition of Kozilek
Thoughtseize
Lingering Souls
Painful Truths
Maelstrom Pulse

Lands (24)
Verdant Catacombs
Marsh Flats
Overgrown Tomb
Shambling Vent
Twilight Mire
Windswept Heath
Swamp
Hissing Quagmire
Godless Shrine
Gavony Township
Plains
Temple Garden
Stirring Wildwood
Forest

Sideboard (15)
Engineered Explosives
Fulminator Mage
Timely Reinforcements
Curse of Death’s Hold
Stony Silence
Creeping Corrosion
Surgical Extraction
Damnation
Painful Truths

Keep in mind that I began the process in late June, so the Grim Flayer and Collective Brutality technology didn’t exist at the time. At this point I also decided that I wanted to use Sheridan’s results in my final analysis, since they were an already compiled data point. To make this work I would be using his list for the actual testing, which was not a problem since Abzan hadn’t dramatically evolved since December.

The Gauntlet

I wanted a mix of fair and less-fair decks for my gauntlet. I also wanted the results to be applicable to the metagame as it existed when I began. Complaints about linearity and aggro saturation were particularly high at the time, so I settled upon some fair and unfair linear aggro and the most successful truly unfair deck in Modern. The other consideration was that I wanted decks where Mystic could have an impact. I doubt very strongly that Tron cares about an artifact that’s smaller than Wurmcoil Engine, and I wanted to improve the chances of results worth reporting.

I also made sure to go as stock as possible with these lists. I wanted the most representative results possible, and the less common builds could have skewed things. This was difficult for Burn and Infect, as everyone has their own take, and I ended up aggregating lists to find the “average” deck. The rest seemed to be pretty close to consensus and were relatively easy. As a bonus, the decks had sideboards that were reasonable in a Mystic-fueled Modern.

Burn, by David Ernenwein (Test Deck)

Creatures (14)
Goblin Guide
Monastery Swiftspear
Eidolon of the Great Revel
Grim Lavamancer

Instants (18)
Lightning Bolt
Searing Blaze
Skullcrack
Boros Charm
Atarka’s Command

Sorceries (8)
Rift Bolt
Lava Spike

Lands (20)
Bloodstained Mire
Wooded Foothills
Scalding Tarn
Sacred Foundry
Mountain
Stomping Ground

Sideboard (15)
Destructive Revelry
Deflecting Palm
Lightning Helix
Path to Exile
Searing Blaze
Skullcrack
Grim Lavamancer

If traditional Naya or 5-Color Zoo had had any metagame presence at the time, I would have gone with those, as they’re closer to what players think of when we talk about fast aggressive decks. The Burn decks that run Wild Nacatl may have a different result than this more traditional list, but the version above is still widely represented and there is considerable dissent about which is better.

Burn was a good choice for the red side of aggro, but as for the non-red I really had only one choice. I wanted top-tier decks that had proven themselves and when I started, there was only one deck that fit the criteria.

Mono-Blue Merfolk, by David Ernenwein (Test Deck)

Creatures (28)
Cursecatcher
Silvergill Adept
Lord of Atlantis
Master of the Pearl Trident
Harbinger of the Tides
Tidebinder Mage
Merrow Reejerey
Master of Waves
Kira, Great Glass-Spinner

Artifacts (4)
Aether Vial

Enchantments (4)
Spreading Seas

Instants (4)
Dismember
Spell Pierce

Lands (20)
10 Island
Mutavault
Cavern of Souls
Minamo, School at Water’s Edge
Oboro, Palace in the Clouds

Sideboard (15)
Tectonic Edge
Relic of Progenitus
Hurkyl’s Recall
Gut Shot
Negate

Honestly, even if Merfolk wasn’t Tier 1 I would have tested it anyway. It’s my deck and I want to know what effect Mystic would have on it. Testing with this deck also reminded me why I play UW Merfolk instead. I ended up missing Path to Exile and Echoing Truth, as well as my sideboard, and being underwhelmed by Harbinger. Still, I’m the only one playing that version, so I played the same deck everyone else does.

And then we have the most complained-about deck (that isn’t Dredge).

Infect, by David Ernenwein (Test Deck)

Creatures (12)
Glistener Elf
Blighted Agent
Noble Hierarch

Instants (24)
Might of Old Krosa
Mutagenic Growth
Vines of Vastwood
Become Immense
Apostle’s Blessing
Spell Pierce
Twisted Image
Dismember
Distortion Strike

Sorceries (4)
Gitaxian Probe

Lands (20)
Inkmoth Nexus
Misty Rainforest
Forest
Breeding Pool
Verdant Catacombs
Windswept Heath
Pendelhaven
Wooded Foothills

Sideboard (15)
Grafdigger’s Cage
Spellskite
Kitchen Finks
Dismember
Dispel
Nature’s Claim
Twisted Image
Dryad Arbor

Infect has the fastest kill in the format, but it’s fairly vulnerable to Jund and Abzan’s disruption, and like Affinity it can ignore Batterskull. This matchup would really show how powerful a threat Batterskull is, rather than just how well it acts as a wall and lifegain source.

Ad Naus is the most successful unfair deck in Modern now. Grishoalbrand is more broken but also inconsistent, and rarely appears on our tiering charts. Scapeshift is a fair deck and Titan Breach really wasn’t a deck when I started.

Ad Nauseam, by David Ernenwein (Test Deck)

Creatures (5)
Simian Spirit Guide
Laboratory Maniac

Artifacts (8)
Lotus Bloom
Pentad Prism

Enchantments (4)
Phyrexian Unlife

Instants (15)
Ad Nauseam
Angel’s Grace
Spoils of the Vault
Pact of Negation
Lightning Storm

Sorceries (8)
Sleight of Hand
Serum Visions

Lands (20)
Temple of Deceit
Gemstone Mine
Seachrome Coast
Darkslick Shores
Temple of Enlightenment
Island
Plains

Sideboard (15)
Spellskite
Leyline of Sanctity
Echoing Truth
Hurkyl’s Recall
Pact of Negation
Slaughter Pact
Thoughtseize
Boseiju, Who Shelters All

Despite what was said during the World Championship I think the matchup of Abzan vs. combo decks is pretty even. When Abzan goes Inquisition, Tarmogoyf, Liliana, it’s hard to lose. If it doesn’t get the right disruption or a decent clock it will lose. Testing would be focused on whether Mystic improves the clock enough to shift the matchup.

Project Creep

I was proceeding through testing all these decks when I began to notice a trend in the data. This trend was interesting enough that I wanted to confirm the result, despite the exhaustion all this Magic was causing. However, with PPTQ season getting underway I didn’t think that was possible. Then I won the first one and suddenly I didn’t need to test for real anymore. With my ticket to the RPTQ punched (congratulations to Jordan for doing the same), I had the time to actually test more decks. To confirm the data trend I would need another fair deck and a less fair one. Thus I added two more decks to my gauntlet.

Jund is the poster child for fair decks and I would have gone with it if I could have found a Jund player to test with. I didn’t, but a Jeskai player volunteered, and Jeskai will do.

Jeskai Control, by David Ernenwein (Test Deck)

Creatures (6)
Snapcaster Mage
Vendilion Clique
Emrakul, the Aeons Torn

Instants (19)
Lightning Bolt
Path to Exile
Spell Snare
Mana Leak
Remand
Lightning Helix
Timely Reinforcements

Planeswalkers (4)
Nahiri, the Harbinger

Sorceries (8)
Serum Visions
Ancestral Vision
Anger of the Gods

Lands (23)
Celestial Colonnade
Flooded Strand
Scalding Tarn
Island
Steam Vents
Sulfur Falls
Mountain
Plains
Arid Mesa
Ghost Quarter
Hallowed Fountain
Sacred Foundry

Sideboard (15)
Engineered Explosives
Spreading Seas
Stony Silence
Celestial Purge
Dispel
Negate
Wear // Tear
Geist of Saint Traft
Crumble to Dust
Timely Reinforcements
Wrath of God

Dredge seemed like a good candidate for the unfair deck. It was the new hotness at the time and while I didn’t think turn-three Batterskull would be good, that was actually in line with the phenomenon I wanted to test. The problem was that after the practice matches it was clear that Abzan’s win percentage game one was too low and the sideboard matches too swingy for me to consider the data valid. Abandoning that, I looked at the current tiered unfair decks and went with Death’s Shadow.

Death's Shadow Zoo, by David Ernenwein (Test Deck)

Creatures (19)
Death’s Shadow
Monastery Swiftspear
Wild Nacatl
Street Wraith
Steppe Lynx

Artifacts (4)
Mishra’s Bauble

Instants (13)
Mutagenic Growth
Temur Battle Rage
Become Immense
Lightning Bolt

Sorceries (7)
Gitaxian Probe
Thoughtseize

Lands (17)
Windswept Heath
Bloodstained Mire
Verdant Catacombs
Arid Mesa
Blood Crypt
Godless Shrine
Overgrown Tomb
Sacred Foundry
Stomping Ground
Wooded Foothills

Sideboard (15)
Hooting Mandrills
Grafdigger’s Cage
Stony Silence
Phyrexian Unlife
Ancient Grudge
Dismember
Tarmogoyf
Natural State
Inquisition of Kozilek
Pyroclasm
Forest

Death’s Shadow presents itself as another Zoo deck but with an unfair fast win, coupled with consistency, that pushes aggro decks out of fair territory. On reflection, picking a deck that straddles fair and unfair is the best indication of what Stoneforge will actually do to both. Tracking the fair Zoo style wins versus the Become Immense wins proved enlightening.

Adding all these decks to the gauntlet and finding experienced pilots to work with added several weeks to the project. For anyone looking to perform a similar test, take care to limit yourself and keep your curiosity in check or project creep like this will ruin you. If I wasn’t butting up against the next banned announcement I might still be collecting data. Which brings us to my actual methodology.

Methodology

I would be playing the Abzan decks. My project, I would do the grunt work. I didn’t want to switch off piloting decks because I wanted to model how these matchups would actually play out in “real Magic,” where players know their decks and know the matchups. This required finding experienced pilots who were as crazy as I am, who specialized in the decks I wanted to test, and who were willing to use these stock lists (I negotiated with a few of them over what actually went into the lists).

This was about as hard as you’d think, especially when I explained the scale of the project. In the end I found online players for Burn, Merfolk, Death’s Shadow, and Ad Nauseam. The previously codenamed “Elliot” agreed to pilot Infect and then Jeskai in paper after some begging and persuasion. As I’m writing this my online partners have not told me how they want to be credited. If I get responses, I will add them in.

Test Parameters

I ambitiously set the target of 100 matches per deck, 50 with the “normal” configuration and 50 with Stoneforge. This actually isn’t a large enough sample size (n) for a true statistical study, but it would be reasonably representative. Play/draw was alternated, with the initial decision based on a coin flip, ensuring 25 games apiece on the play for each deck. Sideboarding was included, and will be covered in the discussion of the data next week. The testing was conducted over a number of sessions due to scheduling concerns and MODO crashes. The “Elliot” testing was done in person; the rest was online.
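To put a rough number on the sample-size caveat, here is a minimal Python sketch of how much uncertainty a 50-match sample carries and how two configurations could be compared. The function names and the match counts in the usage lines are hypothetical placeholders, not results from this testing, and the Wilson interval and pooled two-proportion z-test are simply one standard choice assumed here for illustration.

```python
import math


def win_rate_interval(wins, matches, z=1.96):
    """Wilson score interval for a win rate (z=1.96 is roughly 95%)."""
    p = wins / matches
    denom = 1 + z ** 2 / matches
    center = (p + z ** 2 / (2 * matches)) / denom
    half = z * math.sqrt(p * (1 - p) / matches + z ** 2 / (4 * matches ** 2)) / denom
    return center - half, center + half


def two_proportion_z(wins_a, n_a, wins_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a, p_b = wins_a / n_a, wins_b / n_b
    pooled = (wins_a + wins_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # Both samples are all wins or all losses: no detectable difference.
        return 0.0, 1.0
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts, chosen only to show the scale of the uncertainty:
# 25 wins in 50 stock matches versus 30 wins in 50 Stoneforge matches.
low, high = win_rate_interval(25, 50)
z_score, p_value = two_proportion_z(25, 50, 30, 50)
print(f"stock win rate interval: {low:.3f} to {high:.3f}")
print(f"z = {z_score:.2f}, p = {p_value:.2f}")
```

With 50 matches, a deck that wins exactly half of them has a 95% interval of roughly 37% to 63%, so only fairly large shifts between the two configurations would stand out from noise.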

During the Ad Naus sessions we made special consideration for how Lightning Storm doesn’t really work online. We both knew what was supposed to happen, so if that wasn’t reflected by the interface we discussed what would have actually happened in paper and recorded that result. Misclicks were also accounted for, with some matches thrown out and repeated.
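The bookkeeping described in this section (a coin flip for the first match, alternating play/draw afterwards, and throwing out misclicked matches) can be sketched roughly as follows. This is an illustrative Python layout with made-up field and function names, not the record-keeping actually used for the project.

```python
import random
from dataclasses import dataclass


@dataclass
class MatchRecord:
    opponent: str          # e.g. "Burn"
    config: str            # "stock" or "stoneforge"
    abzan_on_play: bool    # whether Abzan was on the play for the match
    abzan_won: bool
    voided: bool = False   # thrown out (e.g. a misclick) and replayed


def play_draw_schedule(n_matches, rng=None):
    """Coin flip decides who is on the play first, then alternate."""
    rng = rng or random.Random()
    first = rng.random() < 0.5
    return [first if i % 2 == 0 else not first for i in range(n_matches)]


def win_rate(records):
    """Abzan's win rate over the matches that were not thrown out."""
    valid = [r for r in records if not r.voided]
    if not valid:
        raise ValueError("no valid matches recorded")
    return sum(r.abzan_won for r in valid) / len(valid)


# A 50-match set always splits 25/25 on the play, whoever wins the flip.
schedule = play_draw_schedule(50)
print(sum(schedule), "matches with Abzan on the play")
```

Keeping voided matches in the log with a flag, rather than deleting them, leaves a record of how often results had to be replayed.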

Prior to the actual test games a minimum of ten practice games were played against each Abzan deck so that we could get our eyes in and get a feel for the matchup to better mimic Stoneforge actually being legal. It also helped us to get the “correct” sideboarding strategy worked out. Once that was decided upon it was not changed for the duration of testing, even when we later concluded in several cases that there was a better strategy.

Next Stop: Enlightenment

Let me begin concluding by saying that this was not a fun exercise, but it was educational and I am a better player for the effort. Magic should be fun, and this grinding was exhausting and enraging (my distaste for MODO approached a burning hatred many times). It will be a while before I try this again, and probably longer before I find anyone willing to join in my madness.

Next week I will present the sideboarding strategies and win percentages, and explain what it all means. See you then!

Read about David’s conclusions in his subsequent article, Testing Stoneforge Mystic in Modern: Part Two.

David began playing Magic during Odyssey block, quit playing Magic when Caw Blade ruled the world, and returned to Modern shortly before Deathrite was banned. He’s made an appearance at the Pro Tour, made money at GP Denver, and is constantly grinding and brewing in Modern.

20 thoughts on “Testing Stoneforge Mystic in Modern: Part One”

  1. Can’t wait to see the results. I’d also be curious to see how SfM would do when slotted into other decks (such as Jeskai), but obviously that would require another gauntlet. I suppose seeing her impact across so many match-ups even in only one deck should give us a pretty good idea whether or not she’s safe. For the record, my prediction is that she is. We’ll see next week if I’m right!

    1. I can’t answer that, partially because I’ve been sitting on mine since Cawblade, partially because it would be a huge spoiler, and partially because WOTC is a loose cannon and you never know with them.

      1. I wouldn’t go that far, but I bought my thopter/swords two banlist updates before they came off, and got my AV right before it got unbanned. You have to think about what’s realistically coming off in the near future. Buying hypergenesis and skullclamps right now is probably not a good use of money, even if they get unbanned 5 years from now.

        I could imagine stoneforge and bloodbraid coming off in the not too distant future (though probably not the kaladesh update) and I could imagine the artifact lands coming off and mox opal going on. The big difference with stoneforge is it’s a pretty expensive card to invest in if it stays on the list for another couple years.

        1. My opinion on this topic is:

          Things which are realistic to come off the list within a couple of years:

          SfM, Preordain, DTT, Dark Depths (no Tier 1 deck + enables a new strategy and is good as a control/value finisher and bad as a combo finisher)

          Stuff which is likely but will take time:

          BBE, Seething Song, Artifact lands (they actually make Affinity worse XD), DRS

          Greetings,
          Kathal

          1. I hope you’re right except for the artifact lands. They’re still banned due to Krark-Clan Ironworks, not Affinity (they indeed make it worse). Stoneforge, Preordain, Dig Through Time and Dark Depths all together sound a little worrying, but it would be nice to see them in a Modern tournament.

  2. Awesome idea, really! As a huge advocate of having as little of a ban list as possible (while of course promoting a healthy format) and a lover of White who wishes it had a bigger role in Modern, I look forward to seeing the results. Quick question; why did you decide to test SFM in Abzan? Is it because you were continuing Sheridan’s experiment, or is it because you think that’s the natural first home for the card?

        1. Probably not, but it would initially be the most played thanks to Legacy. Given what I think the format would end up doing (coming next week) Jeskai would be my guess.

          1. Jeskai, Abzan, and Death and Taxes have already been highlighted as possible decks that could improve with a Stoneforge unban. I think that Affinity too would benefit from having 8 possible Platings (and a creature that can carry it). It could even be sleeved up in Bant Knightfall and some builds of Zoo. Aren’t those a bit too many decks? Wouldn’t it hurt metagame diversity? I think that this is the major issue against a Stoneforge unban. Not the power level.

          2. Mikefon, I’m actually not too worried about that. It’d be no different than Jeskai Control, Grixis Delver, and Burn all using Lightning Bolt but using the card in very different planned ways. The only problem would be if it takes the decks as a whole too far past the power level of the decks around it.

          3. As a reply to your comment, mikefon: SfM will only be played in Affinity the first week before people realize that it is terrible in the deck, since it is slow and the body doesn’t matter. If they wanted that effect they would play the one-mana sorcery that does the same thing.
            Saying that it hurts diversity because it will be played in a lot of decks is like saying Goyf and Snap hurt the diversity of those colors, while they in fact enable different decks to be competitive. I think we will see Stoneforge help a lot more new decks than overpower the ones that would benefit from that card already.

  3. I must say I’m surprised you didn’t touch on two cards that I think are very influential to a possible unbanning.
    Collective Brutality and Kommand both deal with Stoneforge with extreme efficiency. I’m honestly hoping Mardu can become a viable deck on the back of a Stoneforge unban.

    1. Answers are not a reason to unban cards. As the old saying goes, “There are bad answers. There are no bad threats.” Cards like Kommand give decks legs against Stoneforge, but you could easily play answers like Dispel and the threat will still be just as problematic (Brutality just kills Stoneforge like Abrupt Decay). If I’m testing banned cards I need to know if the threat itself is trouble, not whether or not there are answers to it. A sufficiently powerful threat will see play regardless of how easily answered it is.
