Professional sports organizations rely on analytics to provide near-real-time data on players’ on-field performance. From how fast a player runs to the football’s speed, the intensity of a collision, advanced run-pass tendencies, safety rotation, play similarity, line strength and more, this data is full of important metrics. The faster teams receive the data, the faster they can make adjustments to improve and optimize gameplay.
For one professional sports analytics firm, processing the data took four and a half days, far too long for the real-time analysis and reporting it needed.
The challenge for Six Feet Up’s team of expert developers:
The discovery phase of this project required an in-depth analysis of the firm’s existing monolithic process, which included reviewing the code and technology stack to ensure the appropriate tooling was in place. In this case, the pipeline used Python and several popular machine learning libraries, but relied heavily on NoSQL and Amazon Relational Database Service (RDS), accessed from long-running EC2 instances rather than AWS Lambda.
After the discovery phase, Six Feet Up’s expert team of developers hit the ground running to execute an optimization strategy that would take advantage of serverless computing, which removes the need to manage physical servers. Specifically, the team:
By parallelizing and componentizing the process behind the pipeline, Six Feet Up made it produce results faster than a typical systems development life cycle (SDLC) iteration, which allows for quicker reconfigurations and optimizations to the process. The cost difference between keeping multiple machines busy with no room for system errors and parallelizing the process is negligible, but the time and effort saved add up to considerable cost savings.
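As a rough illustration of what parallelizing and componentizing can look like in Python, the sketch below splits a job into independent units and fans them out across workers. It is not the firm’s actual code: `process_game`, `run_monolithic`, and `run_parallel` are hypothetical names, the per-game computation is a placeholder, and a local thread pool stands in for what a serverless setup might run as separate function invocations.

```python
from concurrent.futures import ThreadPoolExecutor


def process_game(game_id: int) -> dict:
    """Hypothetical component: one independent unit of pipeline work."""
    # Placeholder computation (assumption); a real component would
    # load data for the game and run a model on it.
    return {"game_id": game_id, "score": game_id * 2}


def run_monolithic(game_ids: list[int]) -> list[dict]:
    """Process every unit one after another, as a single long-running job would."""
    return [process_game(g) for g in game_ids]


def run_parallel(game_ids: list[int], max_workers: int = 8) -> list[dict]:
    """Fan the same independent units out across a pool of workers."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map preserves input order, so results line up with game_ids.
        return list(pool.map(process_game, game_ids))


if __name__ == "__main__":
    ids = list(range(1, 6))
    # Both strategies yield the same results; only the scheduling differs.
    assert run_parallel(ids) == run_monolithic(ids)
    print(run_parallel(ids))
```

Because each unit depends only on its own input, a failure affects just that unit, which can be retried on its own instead of restarting the whole run.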
In three short weeks, Six Feet Up brought the run time of the firm’s machine learning pipeline down from four and a half days of waiting and hoping components wouldn’t fail to just under 90 seconds, faster than their coffee could even finish brewing. Today, the pipeline continues to be used as the firm’s primary data source for validating major professional football game models against each other.