Big Data Via Connected Cars Hits the Open Road
Data engineers at Purdue University are using a wealth of connected vehicle data to help improve highway safety and efficiency while laying the groundwork for the ultimate edge device, the autonomous vehicle.
In an effort to put “data in the driver’s seat,” researchers at Purdue’s College of Engineering have created an Autonomous and Connected Systems Initiative that seeks to advance the Internet of Things, robotics and autonomy applications. The effort is being supplemented by a new graduate-level course on the application of machine learning to autonomous vehicles.
The foundation of the data initiative is the estimated 12 billion connected-vehicle data points collected across the state of Indiana in a month. “It is big data,” noted Darcy Bullock, a civil engineering professor and director of Purdue’s Joint Transportation Research Program.
Among the goals of the Purdue effort is using its data engineering initiative to forge collaboration between state agencies building new highway infrastructure and auto makers producing connected vehicles and, eventually, autonomous vehicles. The huge data sets provide the ingredients for transportation planning as the connected vehicle evolves into an autonomous people mover.
The Hoosier State spends about $2 billion annually on highway infrastructure, and it trails only Michigan in terms of automotive GDP. Until now, state highway departments and car makers “have never really talked,” Bullock said in an interview. “Both really need each other now.”
The common denominator is the data recorded in connected vehicles, which can be used for everything from real-time traffic updates to gauging the condition of pavement and lane markers.
“From a civil engineering perspective, we need to know from the auto manufacturers what we need to do to build the next generation of roads,” Bullock said.
Hence, Purdue’s connected car initiative is attempting to organize data collected by car makers to help transportation planners keep traffic moving while preparing for the day when autonomous vehicles become a reality.
The initial focus on connected vehicles is driven by the growing amounts of data collected by “black box” systems that record everything from speed to hard braking. (Such a system was used to determine that excessive speed on a winding road contributed to golfer Tiger Woods’ February crash near Los Angeles.)
Data engineers have come to view connected vehicles as the ultimate edge device, generating loads of data about traffic patterns and hazards that could be used to inform transportation planning.
Purdue researchers focused on connected vehicles and gradual autonomy are currently using large sets of anonymized data for testing in controlled environments, including the university’s research into unmanned vehicles and autonomous agriculture. In one use case, machine learning algorithms were used to program drones that mapped car crash scenes.
The edge computing challenge focuses on prioritizing connected data, Bullock said. Data analysts must decide “what’s important in real time, and what’s information we can process at the edge in the car and maybe transmit it at 2 a.m.,” he explained.
Connected car data like pavement conditions can perhaps be sent once a day. “What we really need to know is what’s going on out there on the interstate at any given time because those are conditions where we can make tactical decisions,” he added.
(Bullock used a traffic dashboard visualization to show us precisely where and when he was stuck in a U.S. Interstate 65 traffic jam on his way to the Purdue campus to be interviewed for this story.)
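The split Bullock describes, between data that matters in real time and data that can wait for an overnight upload, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the event names, the `Event` and `EdgeRouter` classes, and the idea of a single deferred batch are hypothetical and do not describe Purdue's or any auto maker's actual system.

```python
from dataclasses import dataclass, field

# Hypothetical categories: which event kinds count as "real time" is an
# assumption for this sketch, not something specified in the article.
REAL_TIME_KINDS = {"hard_braking", "traffic_speed"}


@dataclass
class Event:
    kind: str       # e.g. "hard_braking" or "pavement_condition"
    payload: dict   # sensor reading, left unstructured here


@dataclass
class EdgeRouter:
    """Toy in-vehicle router: send urgent events now, hold the rest."""
    sent_now: list = field(default_factory=list)
    deferred: list = field(default_factory=list)

    def handle(self, event: Event) -> None:
        # Urgent events go out immediately; everything else is queued.
        if event.kind in REAL_TIME_KINDS:
            self.sent_now.append(event)
        else:
            self.deferred.append(event)

    def flush_deferred(self) -> list:
        # Called by a scheduled job (for example, an off-peak upload).
        # Returns the queued events and empties the queue.
        batch, self.deferred = self.deferred, []
        return batch
```

In this sketch a hard-braking event would be appended to `sent_now` as soon as it is handled, while a pavement-condition reading would sit in `deferred` until `flush_deferred()` is called by whatever job runs the overnight upload.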
Purdue’s efforts are being supplemented by curriculum changes designed to train the next generation of data engineers. The university’s data science initiative, for instance, focuses on data applications and “fluency.”
Along with auto makers, Purdue is also working with Google, using its BigQuery platform to accelerate analysis of connected and autonomous vehicle data sets.
Bullock sees more opportunities to scale Purdue’s crowd-sourcing model. “When one considers that most modern cars have a large collection of sensors that can provide this feedback, we must find ways to effectively and quickly share data between manufacturers and agencies in a manner that does not compromise privacy,” he told U.S. lawmakers considering transportation infrastructure legislation.