Brian Stauffer

Can artificial intelligence help us understand racial bias in sports?

Computer scientists apply artificial intelligence and ‘big data’ analytics to demonstrate bias in how sports commentators discuss athletes

The 2019 NFL season quickly evolved into the Lamar Jackson show, every week delivering a different story, usually involving a highlight touchdown, a gaudy stat line, or a charming news conference.

One story, however, was different: following a San Francisco 49ers loss at the hands of Jackson’s Baltimore Ravens on Dec. 1, Tim Ryan, the radio color analyst for the 49ers, suggested that Jackson was successful in part because his dark skin helped him disguise a dark football. The public backlash was swift and loud, even if the fallout was mild (Ryan was suspended for one game).

Instead of an honest conversation about why we talk about certain athletes using racialized language, the sports world settled for an apology and the next news story in the cycle.

It is society’s inability to adequately address issues of race and bias that motivated Mohit Iyyer, an assistant professor of computer science at the University of Massachusetts Amherst, to apply artificial intelligence and “big data” analytics toward answering a central question: Do sports commentators demonstrate bias in how they discuss athletes from different racial backgrounds?

Iyyer’s results indicate that broadcasters focus on natural ability when discussing nonwhite players. This is biased commentating, which can reinforce an old sports trope: Black athletes owe their success to “God-given” talent rather than hard work.

He was hardly the first person to ask these questions or to reach these kinds of conclusions. But Iyyer’s method, which uses artificial intelligence to test what many already suspected to be true, is new.

For decades, social scientists have speculated that the sports world is an important place to ask questions about race, perception and society. In 2019, Iyyer, along with colleagues Jack Merullo, Luke Yeh and others, published a peer-reviewed study at the annual Empirical Methods in Natural Language Processing conference that examined the presence of bias in sports commentary.

Why we use positive and negative words to describe people is a fundamental question of the science of linguistics, which until recently, had little to do with computers. This has changed in the last few decades, with the advent of natural language processing, as summarized by Alvin Grissom II, a computer science professor at Ursinus College who collaborates with Iyyer.

“Natural language processing consists of algorithmic approaches to dealing with human language,” Grissom said. “We see this everywhere in modern society — in machine translation services like Google Translate, in automated call centers, in search engines that process human queries, in conversational agents like Siri and Cortana, and so on.”

Iyyer and his colleagues embarked on an ambitious sports data-mining exercise.

Using YouTube videos, they collected and processed transcripts of 1,455 NFL and NCAA football broadcasts recorded between 1960 and 2019. After collecting these data, they linked mentions of players in the transcripts to information about each player’s race. If “Steve McNair” or “McNair” appeared in a transcript, the mention was linked to that player and given a racial identifier (“white” or “nonwhite”). They crowdsourced racial identification so that no single person decided who was white or nonwhite. Most nonwhite players in the NFL are of African descent, and so the nonwhite category is mostly a stand-in for black players.
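The linking step described above can be sketched in a few lines of Python. This is a minimal illustration, not the researchers’ actual pipeline: the roster, its labels, and the function names are hypothetical, and it assumes a last name is only usable when it belongs to exactly one player on the roster.

```python
import re

# Hypothetical roster: full name -> crowdsourced label (illustrative only).
ROSTER = {"Steve McNair": "nonwhite", "Peyton Manning": "white"}

def build_index(roster):
    """Map lowercase full names, and unambiguous last names, to the full name."""
    index = {}
    by_last = {}
    for full in roster:
        index[full.lower()] = full
        by_last.setdefault(full.split()[-1].lower(), []).append(full)
    for last, fulls in by_last.items():
        if len(fulls) == 1:  # skip last names shared by several players
            index[last] = fulls[0]
    return index

def find_mentions(text, roster):
    """Return (matched text, full name, label) for each mention, in order."""
    index = build_index(roster)
    # Try longer keys first so "Steve McNair" wins over a bare "McNair".
    keys = sorted(index, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b", re.IGNORECASE
    )
    results = []
    for match in pattern.finditer(text):
        full = index[match.group(0).lower()]
        results.append((match.group(0), full, roster[full]))
    return results

print(find_mentions("McNair rolls right. Steve McNair throws deep.", ROSTER))
```

A real system would also need to handle nicknames, misspellings in automatic captions, and players who share a name, none of which this sketch attempts.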

When these computer science perspectives were applied to the data set containing broadcast transcripts, several patterns emerged, some of which highlight an interaction between word choice, a player’s race, and the player’s position on the field.

Announcers refer to nonwhite players by their first name more often than white players, especially at the quarterback position (18.1% for nonwhite vs. 8.3% for white). This pattern doesn’t hold at other positions, where the disparity in first-name reference is either much smaller (wide receiver: 11.3% nonwhite vs. 6.9% white) or nonexistent (running back: 8.5% nonwhite vs. 10.5% white; tight end: 13.8% nonwhite vs. 16.6% white).
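Percentages like these are simple shares: the number of first-name-only mentions for a group divided by all mentions of that group. The sketch below shows the arithmetic on made-up records. The sample data, the reference-type labels and the function name are assumptions for illustration, not values from the study.

```python
from collections import Counter

def first_name_rate(mentions):
    """Share of mentions that use only a first name, per group.

    `mentions` is an iterable of (group, reference_type) pairs, where
    reference_type is "first", "last" or "full" (a hypothetical encoding).
    """
    totals = Counter()
    firsts = Counter()
    for group, ref_type in mentions:
        totals[group] += 1
        if ref_type == "first":
            firsts[group] += 1
    return {group: firsts[group] / totals[group] for group in totals}

# Made-up sample: 10 mentions per group.
SAMPLE = (
    [("nonwhite", "first")] * 2 + [("nonwhite", "last")] * 8
    + [("white", "first")] * 1 + [("white", "last")] * 9
)

for group, rate in first_name_rate(SAMPLE).items():
    print(f"{group}: {rate:.1%}")
```

Computing the same share separately for each position is what lets the quarterback gap stand out from the rest.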

There isn’t a right or wrong way to identify nonwhite and white athletes, but there should be a consistency when commentators discuss them. What Iyyer’s research shows is that broadcasters are using different standards, especially when discussing black quarterbacks. The announcers, for example, talk about black quarterbacks as if they are playing a different position, a bias that could be unconscious.

Iyyer and colleagues then conducted a sentiment analysis, a natural language processing technique that uses the kinds of words in a passage (e.g., “positive” words such as smart, charming, beautiful) to gauge the overall sentiment toward an idea or object. For example: Think of your least-favorite athlete. Natural language processing algorithms can identify the negative words (e.g., horrible, soft, crybaby) that are most associated with that athlete when he is mentioned in the press. This can give a picture of the overall sentiment surrounding that athlete.
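In its simplest form, sentiment analysis can be done by counting words from a positive list and a negative list. The sketch below uses the example words from the paragraph above as a tiny lexicon. It is an assumed, deliberately crude approach for illustration; real systems rely on much larger lexicons or trained models, and this is not presented as the method used in the study.

```python
import re

# Tiny illustrative lexicon built from the example words in the article.
POSITIVE = {"smart", "charming", "beautiful"}
NEGATIVE = {"horrible", "soft", "crybaby"}

def sentiment_score(text):
    """Return a score in [-1, 1]: (positive - negative) / matched words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    if pos + neg == 0:
        return 0.0  # no lexicon words found, so treat as neutral
    return (pos - neg) / (pos + neg)

print(sentiment_score("He is smart and charming but soft"))
```

A score above zero means positive words outnumber negative ones in the passage, and a score below zero means the reverse.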

Iyyer and colleagues applied sentiment analysis to the broadcast transcripts, focusing on the positive and negative words associated with white and nonwhite football players.

  • Positive words, white quarterbacks: cool, smart, favorite, safe, spectacular, excellent, class, fantastic, good, interesting.
  • Positive words, nonwhite quarterbacks: ability, athletic, brilliant, awareness, quiet, highest, speed, wow, excited, wonderful.

The apparent overemphasis on athletic gifts and “natural” ability appears to hold for other positions:

  • Positive words, white players, across positions: enjoying, favorite, calm, appreciate, loving, miracle, spectacular, perfect, cool, smart.
  • Positive words, nonwhite players, across positions: speed, gift, versatile, gifted, playmaker, natural, monster, wow, beast, athletic.
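Lists like the ones above come from asking which words appear disproportionately often near one group compared with the other. One common way to measure that is a smoothed log-odds ratio, sketched below on made-up counts. The counts, the smoothing value `alpha` and the function name are assumptions for illustration, and the study’s exact scoring method may differ.

```python
import math
from collections import Counter

def log_odds(counts_a, counts_b, alpha=0.5):
    """Smoothed log-odds ratio per word; positive values lean toward group A."""
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + alpha * len(vocab)
    total_b = sum(counts_b.values()) + alpha * len(vocab)
    scores = {}
    for word in vocab:
        p_a = (counts_a.get(word, 0) + alpha) / total_a
        p_b = (counts_b.get(word, 0) + alpha) / total_b
        scores[word] = math.log(p_a / (1 - p_a)) - math.log(p_b / (1 - p_b))
    return scores

# Made-up counts of positive words near mentions of each group.
group_a = Counter({"smart": 8, "speed": 2, "good": 5})
group_b = Counter({"smart": 2, "speed": 8, "good": 5})

scores = log_odds(group_a, group_b)
for word in sorted(scores, key=scores.get, reverse=True):
    print(f"{word}: {scores[word]:+.2f}")
```

In this toy example “smart” leans toward the first group, “speed” toward the second, and “good” toward neither, which is the kind of contrast the word lists are meant to surface.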

This task of analyzing sports broadcasts was challenging.

In total, the “FOOTBALL” data set contains 267,778 mentions of 4,668 unique players, and almost two-thirds are nonwhite. The data set makes it possible to examine basic relationships between sports broadcasting and race. For example, Iyyer and colleagues found that mentions of nonwhite quarterbacks increased dramatically between 1970 and 2010, which is consistent with the overall growth in the number of nonwhite (mostly black) quarterbacks in college and professional football during this time. Iyyer noted that the data set has limitations.

“While our database contains over 1,000 games across many decades, we initially imagined building a data set that included a broadcast transcript of every single televised NFL/NCAA game ever played,” Iyyer said. “Copyright restrictions prevented us from accessing all of this data, which is why we resorted to games posted by YouTube users. Currently, our data set is skewed toward more recent [post-2000] games, which prevents us from properly doing this analysis.”

Results like these suggest that the broadcaster transcript data set can tell real stories about the changing world of sports.

The use of artificial intelligence, algorithms and “big data” analytics is now common in sports. Collegiate and professional teams implement these methods in new and provocative ways, mostly to improve performance and safety. Their wider use has also broadened the range of questions one can address, which are no longer limited to whom a team should draft or how to set a starting lineup. These methods can now address “off-the-field” questions: how efficiently a team is spending money, how social media has changed fandom, or the manner in which athletes are discussed and written about in the press.

Lucia Trimbur, an associate professor of sociology at John Jay College and the author of Come Out Swinging: The Changing World of Boxing in Gleason’s Gym, suggests that sport has always housed racial bias:

“Like other large social institutions, sport remains a site of insidious racism and anti-black/anti-brown bias. Today we see anti-black racism manifest most explicitly in the sports black athletes are encouraged to join and coach, the positions they play within those sports, the compensation they earn, and how their performance is discussed. In the case of performance, black players overwhelmingly are applauded for their physical prowess rather than their intellect, a deeply racialized dualism that white players do not confront.”

Race and racism will appear in any arena where there is a competition and stakes: politics, education, art, entertainment and others. And since sports is one of the ultimate forms of entertainment, we should expect no different. Scholars of the past have examined questions similar to the ones that Iyyer and colleagues recently examined with artificial intelligence, with similar results.

It is important to note that algorithms can be biased, which should lead us to be careful in interpreting the results of any algorithm, even those that aim to do “good” (such as the ones developed by Iyyer and colleagues). Sarah Brown is a postdoctoral research associate in the Data Science Initiative at Brown University whose research aims to integrate fairness and equity into algorithms. She cautions:

“The algorithms can only learn from people. They are taking in data, which is history, and trying to make predictions about the future.”

Brown, an expert in computer science and algorithms, shares this opinion with Trimbur, who also believes that we should proceed with caution when interpreting the application of advanced metrics in sports:

“I think we can learn from and do a great deal with analytics and algorithms. At the same time, we must be careful with and transparent about our use of these tools. In the case of algorithms that make decisions, we must take great care that we aren’t reproducing the very racial biases we are trying to change.”

While skepticism from Brown and Trimbur is well-founded, both see the merits of Iyyer’s work. Brown is encouraged by Iyyer’s findings:

“Some of us already feel like the sports world is like this, so using technologies like natural language processing to produce concrete evidence can be powerful. This quantitative evidence allows us to monitor the bias in greater detail, for example, noting how it changes, or doesn’t change, over time or across different broadcasters or networks.”

Trimbur suggests that Iyyer’s work is “an extremely important intervention that helps us see precisely how racial bias works in football commentating.”

She emphasizes that future efforts should integrate algorithmic approaches with other perspectives on studying race:

“Combined with historical and qualitative insights, Iyyer’s findings have the potential to do real anti-racist work by alerting people in the world of sport about the discourses we use and the anti-racist alternatives we should consider instead.”

As with any racial exposé, Iyyer’s work begs us to ask about next steps — what do we do with these findings? But Iyyer never claimed to have all the answers, nor should he feel compelled to offer them. He is, however, encouraged about the future of artificial intelligence approaches to studying race, and sees his findings as the beginning, and not the end, of a conversation:

“The overall goal with this kind of research is to raise awareness about racial bias in sports by providing quantitative evidence of its nature and existence, with the hope of eventually bringing about systematic change in how commentators describe players of different races.”

Will these methods help address the problem of racial bias in sports commentating and, potentially, in other areas of sports? Iyyer balks at the notion that any single intervention can solve anything, but says that “in conversations or debates about how important these issues are, it’s often helpful to point to concrete results from data analyses in addition to anecdotes and opinions just as sports fans do when debating who’s the best team or player.”

Maybe Iyyer is right. We can only imagine a world where conversations about racial bias in sports were only as messy as debates about the NFL MVP, where arguments for Jackson are supported by the numbers. While numbers alone rarely settle any sports debate, that we have them to argue for Jackson makes the MVP debate cleaner than most modern conversations about discrimination in sports.


C. Brandon Ogbunu, a New York City native, is a computational biologist at Brown University. His popular writing takes place at the intersection between sports, data science, and culture.