
Overwatch 2 PvP Beta Analysis: How Data and Community Feedback Inform Game Balance


Heroes of Overwatch, let's talk data! Data is a valuable resource when it comes to game design. Data, player feedback, user research, internal discussion, and simply playing the game ourselves all influence our design decisions. Data can give us a bird’s-eye view, a close look at the smallest details, and a way to illuminate larger trends in our game. Today we’re going to talk about what we can learn from the first Overwatch 2 Beta and how we use data to inform in-game design decisions.


Data and Design: Observing hero performance

The design team wanted to keep an eye on several things as the first beta began. We saw the community play Sojourn and explore the Sombra, Bastion, Doomfist, and Orisa reworks for the first time. We wanted to know how these heroes were performing in the beta and respond with quick changes if they turned out to be ineffective or oppressive to play against.

When evaluating hero performance, we look at all competitive ranks and skill levels, both together and separately, to get the clearest picture of how various portions of the playerbase are reacting to the current state of the game. Players who occupy the highest ranks push the game to its limits and often discover the strongest abilities and strategies much faster than the rest of the playerbase. Lower-ranked players, however, may struggle more to deal with certain heroes and playstyles, so we find it important to keep all players in mind when making decisions about game balance. Rank-specific analyses are a vital part of our data diet, but this blog features data pulled from all skill levels in the beta.


Quantifying Interest: Metrics that help us evaluate hero popularity

The question of “performance” is multifaceted, and we have several different metrics that shine light through the prism of performance from different angles. The first metric is usage rate: a measure of how often a hero is played out of all time played in matches. For example, if Sojourn was played by a team for five minutes out of a 10-minute match, her usage rate would be 50%. Speaking of Sojourn, her usage rate was very high during the initial weeks of the beta but steadily decreased over time.
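For readers who like to see the math spelled out, here is a rough sketch of that calculation in Python. The data layout, field names, and numbers are purely illustrative assumptions, not our actual telemetry schema:

```python
from collections import defaultdict

# Hypothetical per-team match summaries: match length in minutes and how long
# each hero was fielded by that team. The numbers are made up for illustration.
team_matches = [
    {"length": 10, "hero_minutes": {"Sojourn": 5, "Soldier: 76": 5, "Ana": 10}},
    {"length": 8,  "hero_minutes": {"Sojourn": 8, "Ana": 8}},
]

hero_time = defaultdict(float)
total_time = 0.0
for match in team_matches:
    total_time += match["length"]
    for hero, minutes in match["hero_minutes"].items():
        hero_time[hero] += minutes

# Usage rate: a hero's time played out of all time played in matches.
usage_rate = {hero: played / total_time for hero, played in hero_time.items()}
print(usage_rate)  # Sojourn: (5 + 8) / (10 + 8) ≈ 0.72, Ana: 18 / 18 = 1.0
```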
 

Daily usage rates over time for all heroes, all ranks in beta. Sojourn, Ana, Orisa, Sombra, Bastion, and Doomfist are highlighted.
 

Sojourn peaked at nearly 80% usage rate, which is incredible for a damage hero. Ana also had a very high usage rate during most of the beta, though that is true in the live game as well. Orisa and Doomfist, tanks who received heavy reworks, also saw high usage throughout the beta.

Usage rate is an excellent metric for seeing the game exactly as it appears to players in an absolute sense. Sojourn was the most-played hero when the beta first went live; at that point, over half of all time played in the beta featured a Sojourn on both teams. However, there’s more context that needs to be considered when looking at usage rate. Sojourn saw a lot of gameplay in the beta, and her raw usage rate becomes even more significant once you factor in how many other damage heroes were available to be chosen instead.

To look at usage rate while also acknowledging this role imbalance, we also consider a different metric called weighted usage rate. Weighted usage rate is a measure of how often a hero is played relative to the number of heroes in its role. More specifically, we take each hero's raw usage rate and divide it by the equilibrium usage for its role: the usage rate at which all heroes in a role are played equally. The final measure is thus the ratio of the hero’s usage rate to this equilibrium rate. Returning to Sojourn, we can see that in the early stages of the beta she was outperforming every other hero’s weighted usage by an incredible margin.
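As a back-of-the-envelope illustration, the calculation could be sketched like this. The role slot counts and roster sizes below are assumptions made for the example rather than exact beta figures:

```python
# Illustrative slots per team in 5v5 and rough role roster sizes; the exact
# counts here are assumptions for the example, not official figures.
ROLE_SLOTS  = {"damage": 2, "tank": 1, "support": 2}
ROLE_HEROES = {"damage": 17, "tank": 9, "support": 7}

def equilibrium_usage(role: str) -> float:
    """Usage rate each hero in a role would have if all were played equally."""
    return ROLE_SLOTS[role] / ROLE_HEROES[role]

def weighted_usage(raw_usage: float, role: str) -> float:
    """Raw usage rate expressed as a multiple of the role's equilibrium rate."""
    return raw_usage / equilibrium_usage(role)

# A damage hero at 80% raw usage vs. an ~11.8% equilibrium is roughly 6.8x.
print(round(weighted_usage(0.80, "damage"), 2))  # ≈ 6.8
# A tank in the mid-40%s vs. an ~11.1% equilibrium is roughly 4x.
print(round(weighted_usage(0.45, "tank"), 2))    # ≈ 4.05
```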
 

Weighted usage rates over time for all ranks in beta. Sojourn, Ana, Orisa, Sombra, Bastion, and Doomfist are highlighted.
 

Using weighted usage rate helps us more accurately portray the relative interest in new heroes or reworks like Orisa and Doomfist compared to heroes like Ana, who is already a popular hero. For example, Sojourn peaked at over 6x in weighted usage rate, which means she was played over six times more than the equilibrium rate for damage heroes. Orisa and Doomfist may have only reached into the mid-40%s in raw usage rate, but weighted usage rate helps us see that they were as engaging to tank players as Ana was to support players.
 

Hero Balance: How data & feedback drive change

Our team had a focused goal for the PvP beta: we wanted all our heroes to be fun to play without being unfair to play against. Our usage rate analyses suggested that we were on track with the former, and thus our focus turned to game balance. Deciding how to balance a hero is a purposeful decision that is made with input from many sources, data-driven or otherwise. For example, consistent feedback from support players regarding the survivability of the role in the beta directly informed many of the support changes in the May 5 balance update.

Player feedback is often one of the first signals that a balance change is needed, and data can help inform these decisions as well. One such way of measuring hero performance is by looking at map win rates. However, due to the nature of Overwatch— namely the ability to switch heroes— basic map win rates are not representative of a hero’s actual performance. If you play Sojourn in a map win, but only for half of the map, should that count as a full map win in a win rate calculation? No!

Fear not, that scenario is only the first step in how we track win rates. To arrive at a better metric, we approach win rates fractionally by looking at how long a certain hero is played within a map. In the above half-map scenario, let’s say that the map lasted 10 minutes and Sojourn was played for half that time. Sojourn would have earned a 0.5 “win fraction” for that map because she was played for five minutes out of 10. Had the map been a loss, Sojourn would have earned a 0.5 “loss fraction.”

To translate these fractions into win rates, we first add all of Sojourn’s win fractions across all maps played. Then we divide the total win fractions by Sojourn's total fractions -- both wins and losses. This approach arrives at a win rate that is much more representative of how the hero is performing while also accounting for hero switching.
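Sketched in code, that aggregation looks something like the following; the per-map records below are made up for the example rather than pulled from real beta data:

```python
# Hypothetical per-map records for one hero: (minutes_played, match_minutes, won).
plays = [
    (5, 10, True),    # half of a 10-minute win  -> 0.5 win fraction
    (10, 10, False),  # all of a 10-minute loss  -> 1.0 loss fraction
    (8, 8, True),     # all of an 8-minute win   -> 1.0 win fraction
]

win_fraction = 0.0
total_fraction = 0.0
for minutes_played, match_minutes, won in plays:
    fraction = minutes_played / match_minutes  # share of the map the hero was fielded
    total_fraction += fraction
    if won:
        win_fraction += fraction

# Fractional win rate: total win fractions divided by total fractions (wins and losses).
win_rate = win_fraction / total_fraction
print(f"Fractional win rate: {win_rate:.0%}")  # (0.5 + 1.0) / 2.5 = 60%
```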

One weakness of this metric is that the more a hero is played, the more their win rate is pushed toward 50%. This exact scenario happened in the alpha, where Sojourn was both incredibly powerful and highly played. Because both teams fielded Sojourn, her regular win rate barely budged from 50% despite being overtuned. In a match where both sides have a Sojourn, one team has to win and one team has to lose, after all.

We address this by applying the same win fraction calculation to the periods of time where only one team is fielding that hero, which we call an "unmirrored" state. By looking at unmirrored win rate, we can see how a hero like Sojourn (whose mirrored win rate sat just over 50%) performs when there isn't another Sojourn on the opposing team, separating her win rate further from 50%.
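Here is a minimal sketch of that filter, assuming we can reconstruct, for each map, the time intervals during which each team fielded the hero. The interval representation is an assumption made for this illustration:

```python
def overlap(a, b):
    """Length of the overlap between two (start, end) intervals, in minutes."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def unmirrored_fraction(team_intervals, enemy_intervals, match_minutes):
    """Fraction of the map a hero was fielded by this team while the enemy
    team was NOT fielding the same hero (the 'unmirrored' time)."""
    unmirrored = 0.0
    for interval in team_intervals:
        mirrored = sum(overlap(interval, e) for e in enemy_intervals)
        unmirrored += (interval[1] - interval[0]) - mirrored
    return unmirrored / match_minutes

# Team A fields Sojourn from minute 0 to 8 and team B mirrors her from minute
# 3 to 8 of a 10-minute map: only minutes 0-3 are unmirrored, so team A earns
# a 0.3 fraction toward its win or loss total instead of the full 0.8.
print(unmirrored_fraction([(0, 8)], [(3, 8)], 10))  # 0.3
```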

That’s enough explanation about win rate metrics. Let’s look at some unmirrored win rates over time:
 

Unmirrored win rate over time for all ranks in beta. Sojourn, Orisa, Sombra, Doomfist, Soldier: 76, and Symmetra are highlighted.
 

This chart illustrates the unmirrored win rates of heroes throughout the beta and why data is only one of many inputs used in making hero balance decisions. It may be surprising to some players to see that Orisa is so low, or that Symmetra is so high! What ultimately makes these metrics useful is a healthy understanding of the context in which they are generated.

Take Symmetra, for example. She consistently has one of the highest win rates in both the beta and live game because people tend to play Symmetra in situations where she is more likely to win, like defending the first point of a map. Players who utilize Symmetra are also more likely to swap off her very quickly when they suspect they may lose the game, further pushing her win rate in a positive direction.

If we apply the reverse of this logic to Sojourn and Orisa, we can better understand why their win rates may be lower than one might suspect. Players were excited to play them in the beta, but everyone was unfamiliar with their new abilities and playstyles. This led to players picking them even in losing situations where it may have been more advantageous to swap to a different hero. After all, it is difficult for a player with only several hours on Sojourn to be as effective as an opponent with hundreds of hours on Soldier: 76.

Swimming through all this context and data makes the process of deciding which heroes to focus on for balance updates quite complex. Sojourn’s adjustments were a minor pullback on a series of changes that she received during alpha testing, when her win rate was nearly as much above 50% as it is currently below it. Orisa remains unchanged for now, as it is still unclear if she is bad, or if the community at large is just bad at her. We're leaning towards the latter, as we all seemed to have that one friend who steamrolled through the beta with Orisa and her new kit. However, future changes to Orisa and Doomfist are currently being explored. Soldier: 76 received an update thanks to a healthy mixture of data paired with community feedback, and Sombra also received a similar movement speed adjustment to account for both heroes’ unintended synergy with the new damage hero passive ability. The remaining heroes who were affected in the May 5 balance patch went through similar processes.
 

Stimulus and Response: Evaluating the results of our updates

Now the fun part begins. Just as with our goals entering the beta, we wanted to analyze the outcome of these balance changes. While it’s important to remember that the win rate changes we’re observing here originate from a limited pool of beta players playing a non-competitive mode, they can still be very useful for rapidly gauging whether balance changes had noticeable effects. Was the change to Soldier: 76 enough to knock his win rate down a peg? Were the support changes effective in improving struggling heroes like Zenyatta? Let’s extend that win rate chart out several days and see, starting with the supports:
 

Unmirrored win rate over time for all ranks in beta. Supports who received balance changes are highlighted.  

For the most part, all supports who received a balance adjustment, including the bugfix to Mercy’s Valkyrie, saw an instant change in unmirrored win rate—except for Baptiste, who received changes in a later beta patch. This means that by and large, those changes had their intended effects! However, no hero was impacted as much as Zenyatta, who saw a win rate swing of roughly +5%. Historically, we have observed that changes to heroes’ health pools have had the most drastic effects on win rates, so this was not entirely unexpected. We will be monitoring Zenyatta’s newfound power (as well as his Snap Kicks) with great interest. Next, let’s look at the tanks:
 

Unmirrored win rate over time for all ranks in beta. Tanks who received balance changes are highlighted.

Besides Roadhog, tanks received relatively minor balance adjustments, and the changes (or lack thereof) in their win rates reflected that. Roadhog and Wrecking Ball both enjoyed increases of 1-2% in their win rates as we tried to help those heroes better adjust to the new 5v5 environment. Finally, let’s look at damage heroes:
 
Unmirrored win rate over time for all ranks in beta. Damage heroes who received balance changes are highlighted.

There’s no doubt that Soldier: 76 felt the heat from the “nerf bat” here, having dropped over 6% in win rate after receiving three separate changes. Sojourn, on the other hand, climbed from a low of 42-43% to a respectable 44-45% win rate. Sombra’s adjustment was more of a realignment to account for the interaction between movement speed abilities and the new damage passive, and it did not result in much of a change in her win rate.
 

Gameplay Philosophy: Building a game with data and community collaboration

This process of change, discovery, analysis, and change is a never-ending cycle of balance. As popular strategies shift, new heroes might become the next Soldier and rise above the rest. With every new hero release or rework, we must be ready to make changes to ensure that they are neither too powerful nor too weak. When we make changes, we are constantly evaluating whether those changes were effective or need more punch behind them. If your favorite hero did not receive a change and you felt that they should have, we hope that the metrics introduced above help illuminate several of the inputs behind hero balance in Overwatch and give everyone faith in the process. Game balance is a marathon, not a sprint, and many more hero changes await us on the horizon. See you next beta!
