Why AI Is a Slam Dunk for the NBA
Thanks to the advent of player-tracking data in the NBA and the use of machine learning software running on powerful servers, we’re on the cusp of having some fouls called automatically in professional basketball. But that is just the beginning of what AI can do in the NBA, according to Dwight Lutz, the senior director of basketball strategy and analytics for the Atlanta Hawks.
In a virtual talk presented by The Society of HPC Professionals on Friday, Lutz said we’re very close to having an AI referee that can call one specific foul: a defensive three-second (D3S) violation. Under the rule, which the NBA instituted in 2001 to speed up game play and bolster offensive excitement, a defensive player may not stay in the lane for more than three seconds unless he is guarding an offensive player or attempting a rebound.
You can thank the maturation of the NBA’s player tracking system, which was first implemented for the 2013-2014 season, for the advent of AI refs in the NBA. During that year, the NBA contracted with an outside entity to install and run an array of cameras in the rafters of all 29 NBA arenas. The cameras track the movement of all players (and the ball) at 25 frames per second. A computer vision algorithm then turns those images into a time-series set of X-Y coordinate data for individual players that can be fed into analytic systems or used as the basis for statistical modeling.
Lutz, who previously worked as a data scientist for the NBA, provided SHPCP attendees with a demo of the type of data science that went into the D3S calculations. The first step is to write a rules-based algorithm to obtain video for all instances when a player was in the lane for more than three seconds. Humans then watch those videos and label each one as a violation or not. The labeled set also includes instances where D3S was actually called by the referees (which, according to some observers, doesn’t happen nearly enough, especially in the second half of games), and cases where the defensive player was going for a rebound.
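The first step described above, a rules-based pass that flags every stay in the lane longer than three seconds, might look roughly like the following Python sketch. The coordinate system, the lane rectangle, and the function names are assumptions made for illustration and are not taken from the NBA's actual pipeline; only the 25-frames-per-second rate comes from the talk.

```python
from typing import List, Tuple

FPS = 25                    # tracking frame rate cited in the article
THRESHOLD_FRAMES = 3 * FPS  # three seconds' worth of frames

# Assumed coordinate system, in feet: x runs along the court starting at the
# baseline, y runs across the 50 ft width. The lane is modeled as a rectangle
# 16 ft wide that extends 19 ft out from the baseline.
LANE_X = (0.0, 19.0)
LANE_Y = (17.0, 33.0)


def in_lane(x: float, y: float) -> bool:
    """True if a tracked center-of-mass position falls inside the lane."""
    return LANE_X[0] <= x <= LANE_X[1] and LANE_Y[0] <= y <= LANE_Y[1]


def find_lane_stays(track: List[Tuple[float, float]]) -> List[Tuple[int, int]]:
    """Return (start_frame, end_frame) spans, end exclusive, in which the
    player stayed in the lane for more than three seconds."""
    spans = []
    start = None
    for i, (x, y) in enumerate(track):
        if in_lane(x, y):
            if start is None:
                start = i  # player just entered the lane
        else:
            if start is not None and i - start > THRESHOLD_FRAMES:
                spans.append((start, i))
            start = None
    # Handle a stay that is still in progress when the track ends.
    if start is not None and len(track) - start > THRESHOLD_FRAMES:
        spans.append((start, len(track)))
    return spans


# Synthetic track: 80 frames in the lane (3.2 s), 10 frames outside,
# then 50 frames in the lane (2.0 s). Only the first stay is flagged.
track = [(10.0, 25.0)] * 80 + [(30.0, 25.0)] * 10 + [(10.0, 25.0)] * 50
candidates = find_lane_stays(track)
```

The flagged spans are only candidates. In the workflow Lutz describes, the matching video clips would then go to human reviewers for labeling.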
Because the NBA’s player tracking data determines only the center of a player’s mass, there needed to be some offsets for the size and “length” of a player (i.e. his wingspan), to account for instances where a very large player is guarding another player from a distance (possibly from inside the lane). Once these tweaks and others are made, the data can be loaded into a probabilistic model, which can determine whether a player has committed D3S with a great degree of accuracy, according to Lutz.
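To make the idea of a size offset feeding a probabilistic model concrete, here is a small Python sketch. Everything in it is an assumption for illustration: the article does not say which model is used, so a logistic function stands in for it, the "half the wingspan" reach adjustment is a rough guess, and the weights are placeholders rather than fitted values.

```python
import math
from typing import Tuple


def reach_adjusted_gap(defender_xy: Tuple[float, float],
                       attacker_xy: Tuple[float, float],
                       wingspan_ft: float) -> float:
    """Distance between two tracked centers of mass, minus half the
    defender's wingspan as a rough stand-in for arm reach (an assumption)."""
    dx = defender_xy[0] - attacker_xy[0]
    dy = defender_xy[1] - attacker_xy[1]
    return max(0.0, math.hypot(dx, dy) - wingspan_ft / 2.0)


def violation_probability(seconds_in_lane: float, gap_ft: float,
                          w_time: float = 2.0, w_gap: float = 0.8,
                          bias: float = -8.0) -> float:
    """Logistic score in (0, 1). The weights are illustrative placeholders;
    a real model would learn them from the human-labeled clips."""
    z = bias + w_time * seconds_in_lane + w_gap * gap_ft
    return 1.0 / (1.0 + math.exp(-z))


# Same positions, different wingspans: the longer defender is treated as
# being closer to the player he may be guarding.
long_gap = reach_adjusted_gap((8.0, 25.0), (14.0, 25.0), wingspan_ft=7.0)
short_gap = reach_adjusted_gap((8.0, 25.0), (14.0, 25.0), wingspan_ft=6.0)

p_long = violation_probability(3.4, long_gap)
p_short = violation_probability(3.4, short_gap)
```

With these placeholder weights, a larger reach-adjusted gap pushes the probability of a violation up, which reflects the intuition that a defender far from any offensive player is less likely to be actively guarding someone.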
“D3S is something in the works that’s going to happen,” said Lutz. “This will actually take place eventually.”
Will other fouls follow? The jury is still out on that one, Lutz says. “What people are working on, which actually requires a much higher resolution cameras,” he says, “is figuring out where a player’s hands are, where their feet are, and that will actually allow us to create a simulation, kind of like how you see in a video game, which would then allow you to be able to figure out if player fouled another player to some accuracy.”
While automatically calling blocking and reaching fouls may be a way off still, the technology is nearly good enough to automate the calling of more “discrete” violations, such as stepping out of bounds, he said.
But there is a lot more that can be done with this data outside of refereeing the game. “We’re just actually touching the tip of the iceberg in what we can do with the data,” Lutz said. “We’re just now starting to figure out…which ones matter.”
Some of the most interesting examples of basketball analytics that Lutz’s group is undertaking include clustering analysis, whereby players are grouped into clusters according to their characteristics. This analysis can help inform personnel decisions, which are critical in the NBA.
For example, where does a player tend to shoot from? Do they create their own shot, or do they mostly let others create the shot? How often does the player dribble? Do they like to drive to the basket, post up, or run the pick and roll?
According to Lutz, the answers to these questions can be gleaned from careful analysis of the player-tracking data combined with other data, including the scorekeeper’s play-by-play (PBP) data. With clustering algorithms like K-Means and Gaussian Mixture Models, data scientists can create groups that describe the type of player.
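As a rough illustration of the clustering step, the Python sketch below runs a plain k-means on a handful of made-up player profiles. The two features (share of shots taken from three-point range and share of shots that are self-created), the player names, and all values are synthetic assumptions. The farthest-point initialization is chosen only to keep the example deterministic, and a real workflow would more likely rely on an established library.

```python
from typing import List, Sequence

Point = Sequence[float]


def dist2(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(members: List[Point]) -> List[float]:
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(col) / len(members) for col in zip(*members)]


def init_centroids(points: List[Point], k: int) -> List[List[float]]:
    """Deterministic start: the first point, then repeatedly the point
    farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(dist2(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids


def kmeans(points: List[Point], k: int, max_iters: int = 100) -> List[int]:
    """Assign each point to one of k clusters; returns one label per point."""
    centroids = init_centroids(points, k)
    labels = None
    for _ in range(max_iters):
        new_labels = [min(range(k), key=lambda c: dist2(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = centroid(members)
    return labels


# Synthetic profiles: (three_point_share, self_created_share).
players = ["A", "B", "C", "D", "E", "F"]
features = [(0.10, 0.80), (0.15, 0.75), (0.12, 0.85),   # ball-dominant creators
            (0.60, 0.20), (0.65, 0.15), (0.55, 0.25)]   # spot-up shooters
labels = kmeans(features, k=2)
player_cluster = dict(zip(players, labels))
```

On this toy data the first three players land in one cluster and the last three in the other. A Gaussian Mixture Model, the other method Lutz mentions, would instead give each player a probability of belonging to each group.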
“Once we have these clusters, we then determine which types of cluster play well together,” Lutz said. “When two different clusters are on the court together, does that go well, does that go poorly, etc.”
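One simple way to ask whether two clusters “go well” together is to total the point differential and possessions for every stretch in which players from both clusters share the floor, then express the result per 100 possessions. The Python sketch below does that on synthetic data. The cluster names, the two-player stints, and the numbers are assumptions for illustration and are not results from the Hawks’ model.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Tuple

# A stint is (players on the floor, point differential, possessions).
Stint = Tuple[Tuple[str, ...], int, int]


def pair_net_ratings(stints: List[Stint],
                     cluster_of: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Net rating (point differential per 100 possessions) for each pair
    of clusters that shared the floor."""
    totals = defaultdict(lambda: [0, 0])  # pair -> [point_diff, possessions]
    for players, point_diff, possessions in stints:
        clusters = sorted(cluster_of[p] for p in players)
        # set() ensures each stint counts once per distinct cluster pair.
        for pair in set(combinations(clusters, 2)):
            totals[pair][0] += point_diff
            totals[pair][1] += possessions
    return {pair: 100.0 * diff / poss
            for pair, (diff, poss) in totals.items() if poss}


# Synthetic example, simplified to two-player stints.
cluster_of = {"A": "creator", "B": "creator", "C": "shooter"}
stints = [
    (("A", "B"), -4, 50),  # two creators together
    (("A", "C"), 6, 50),   # creator plus shooter
    (("B", "C"), 3, 25),   # creator plus shooter
]
ratings = pair_net_ratings(stints, cluster_of)
```

In this made-up data the creator-creator pairing comes out negative and the creator-shooter pairing positive. A real analysis would use full five-player lineups and would need to control for teammates and opponents.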
Cluster analysis can help inform the front office which players would be good matches with the current team. For example, when LeBron James left the Cleveland Cavaliers to join Dwyane Wade with the Miami Heat for the 2010-2011 season, many basketball observers feared that would be the end of competition in the league.
“That didn’t happen,” Lutz says. “[Our] model would have actually predicted that these two players would not fit well together. They’re both perimeter creators with high possession rates, so they both had the ball a lot, initiating the offense. And they were both poor shooters.”
There were similar concerns when Kevin Durant left the Oklahoma City Thunder to join the Golden State Warriors for the start of the 2016-2017 season. How would Durant fit alongside Stephen Curry, who led the Warriors to the best regular season record in NBA history the previous year?
Those fears were unfounded, according to Lutz’s model. “He was going to have a very easy transition joining the Warriors because, unlike LeBron James and Dwyane Wade, he’s a great off-ball player. And all of the players basically involved with the Warriors were great shooters, which allows the spacing of the game to be very efficient…These are the kind of player-fit related questions that we can answer with this analysis.”
This type of analysis takes a fair amount of computational horsepower, but that’s not a barrier anymore, Lutz said. “Computing power has made some of this stuff just really, really doable, whereas in the past it really wasn’t,” he said.
All NBA teams have access to all the player-tracking data, so it’s up to the individual teams’ data science groups to do something useful with it. Lutz says the Atlanta Hawks do their best to get more granular player data from college and high school teams, although player-tracking data is unavailable at those levels.
It’s unclear what net impact the data is having on the Hawks’ success. But one thing is clear: analytics and AI have become critical to the team’s operations.
“We’ve gotten to the point where this data is so ingrained with what we do as an organization on the basketball side,” Lutz said. “Something I haven’t necessarily done is show directly the impact of the data on our models, which is massive. At this point I’m taking it for granted we’re much more accurate in our predictions now than we were 10 years ago.”