Student team places second at first GfK NextGen Data Science Hackathon contest

Four students design smart speaker using data

Written By Dara Collins, Editor-Elect

A team of four Point Park students placed second in the inaugural GfK NextGen Data Science Hackathon Competition.

GfK, short for the company’s tagline “Growth from Knowledge,” is a global market research company that connects data and science to answer questions from businesses concerning consumers, markets, brands and media in the present and the future.

The student team of Tanner Campbell, Alexa Lake, Edwin Obuya and Sabrina Tatalias was given 10 days to analyze the raw data provided to each competing team and design a smart speaker that would generate the most profit.

“We were supposed to get this data and make a business plan for a smart speaker,” Campbell, a senior Information Technology (IT) student, said. “Who’s your target market? What price are you going to sell it for? What shape is it going to look like? What features is it going to have? We had to do all of that, and we determined it based on our algorithm we wrote to guess the sales outcome of each said feature. So we wrote an algorithm that guessed each said feature and whichever was the most profitable one, that’s what we went with.”
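The selection step Campbell describes, scoring each possible set of features and keeping the most profitable one, might look roughly like the Python sketch below. It is not the team's code. The feature names, the `predicted_profit` function and every number in it are made up for illustration and stand in for whatever model the students actually wrote.

```python
from itertools import product

# Hypothetical yes/no features; the real data set had more columns.
FEATURES = ["voice_assistant", "bluetooth", "ethernet"]

def predicted_profit(config):
    """Stand-in for the team's model. The gains below are invented numbers."""
    made_up_gain = {"voice_assistant": 30.0, "bluetooth": 10.0, "ethernet": -5.0}
    return sum(made_up_gain[name] for name, on in zip(FEATURES, config) if on)

# Every on/off combination of the features: 2 * 2 * 2 = 8 candidates.
candidates = list(product([0, 1], repeat=len(FEATURES)))

# Keep the combination with the highest predicted profit.
best = max(candidates, key=predicted_profit)
print(dict(zip(FEATURES, best)), predicted_profit(best))
```

Trying every combination only works when the list of features is short; with many features the number of candidates doubles with each one added.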

The team was given two sets of data in the form of Excel spreadsheets. One data set covered smart speakers sold in the last three years and listed each speaker's features, such as dimensions, Bluetooth, ethernet and price, according to Campbell.

“Market research is always tricky because if you use historical data, that may not be a good picture of what’s going to happen in the future because trends tend to change,” Assistant Professor Dr. Mark Voortman said.

The second data set was a smart speaker survey, and Voortman, the student team advisor, said the students had trouble deciphering this data set.

“There were questions, answers and you could see which answers were the most popular, and all of these surveys were related to smart speakers […] to get an insight in what the demand would be for different types of speakers and features,” Voortman said.

The team analyzed the two data sets and built a neural network to help design their smart speaker. A neural network is a set of algorithms, loosely modeled on the human brain, designed to recognize patterns and make predictions.

“They used to exist in the 80s, but then the interest died down because there was not enough computational resources, but lately they have been brought back for two reasons,” Voortman said. “There are more computational resources and also, people figured out…if they build layers of these neurons, the performance improved.”

A familiar example of what a neural network can do is image recognition.

“How it works is it focuses on the lower level features of the image like corners,” Voortman said. “That would be like the first layer of the network would recognize corners or curved edges, things like that. The next layer could be a simple shape like a square, and the next layer could maybe recognize a face, and a face consists of multiple smaller parts. That’s the idea of a neural network.”
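The layering Voortman describes can be sketched in a few lines of Python. This is a toy illustration, not the team's network: the weights are invented rather than learned from data, and the sigmoid function used to squash each neuron's output is an assumed, common choice. Each layer takes the previous layer's outputs as its inputs.

```python
import math

def layer(inputs, weights, biases):
    """One layer: each neuron sums its weighted inputs, then applies a sigmoid."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-total)))
    return outputs

def forward(inputs, layers):
    """Pass the inputs through each layer in turn."""
    activations = inputs
    for weights, biases in layers:
        activations = layer(activations, weights, biases)
    return activations

# Invented weights: 3 inputs -> 2 hidden neurons -> 1 output neuron.
network = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]

output = forward([1, 0, 1], network)
print(output)
```

In a real network the weights are adjusted automatically during training until the outputs match known examples, which is the part that libraries handle.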

The team created this neural network to predict the revenue for a given speaker. The student team had the help of existing software to assist in the creation of the neural network.

“Nowadays you don’t build these things from scratch,” Voortman said. “There is existing software that you can use. If you program, there is something called a library, which is a pre-existing piece of software. You can take a library, and you can use the library to do certain tasks. Nowadays, a lot of programming is just pulling in different libraries.”

With the software and the libraries, the students still had to prepare the data, according to Voortman.

“We convert the values into binary, which means like ones and zeros, so if it had a Google Home feature which you can talk to, that was a one,” Campbell said. “If it didn’t, that was a zero. We put all those inputs into an algorithm that based an output on what the sales would be for that year.”
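The encoding Campbell describes can be illustrated with a short Python sketch. The feature names here are hypothetical and chosen only for the example; the article does not list the actual columns the team used.

```python
# Hypothetical feature names, kept in a fixed order so every speaker
# is encoded the same way.
FEATURES = ["voice_assistant", "bluetooth", "ethernet"]

def encode(speaker):
    """Turn yes/no features into ones and zeros; a missing feature counts as zero."""
    return [1 if speaker.get(name, False) else 0 for name in FEATURES]

speaker = {"voice_assistant": True, "bluetooth": True, "ethernet": False}
vector = encode(speaker)
print(vector)
```

A list of ones and zeros like this is the kind of input a prediction model can take directly, since it works with numbers and not with words like "yes" and "no".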

At the conclusion of the competition on Jan. 28, the student teams presented their process and findings in five-minute presentations and fielded questions from a panel of judges for another five minutes. The teams were assessed on their data skills and business intelligence, as well as other criteria.

“[The judges] wanted to know why [the team] was making a recommendation and what data did they use,” Voortman said. “Everything you claimed had to come with evidence from the data.”

Campbell said the team created a PowerPoint to present to the panel through a virtual meeting, as the competition was remote. Following the presentation, the Point Park team was notified that it had finished in second place, behind a team of three undergraduate students from the City University of New York.

Neither the Point Park team nor Voortman knew how many teams participated in the competition. Voortman said the number was not disclosed.

With the Hackathon in the past, the IT department is looking forward to the Data Jam later this month. Data Jam is an event that will teach students about software used to analyze data, as the student team did during the Hackathon.

Students will learn about Tableau, a data visualization program, and Weka, a machine learning tool for building predictive models, during two workshops on Feb. 20 and March 6, respectively, and will give a poster presentation on their data on April 3.

“Even if you don’t have a strong background, you will learn those skills,” Voortman said.