How to Catch ‘Em All: Alumni Spotlight on Yina Gu

Yina was a Fellow in our Winter 2017 cohort who landed a job with one of our hiring partners, Opera Solutions.

Tell us about your background. How did it set you up to be a great Data Scientist?

I received my PhD in computational chemistry from The Ohio State University. For my PhD research, I developed multiple predictive models and published web servers that solve various biophysics problems using machine learning and statistical methods in Python, R, and MATLAB. The data science skills and experience I gained during my five years as a PhD student not only allowed me to solve fundamental scientific problems effectively and efficiently, but also enabled my transition from academia to industry, where I can tackle real-world challenges.

What do you think you got out of The Data Incubator?

The eight-week intensive training at The Data Incubator really helped me go deeper into the data science field and prepared me with the essential skills to work in the big data industry with cutting-edge analytics techniques, including programming, machine learning, and data visualization, as well as a business mindset. Last but not least, I believe networking with the other very talented Fellows was the most valuable thing I got out of TDI!

What advice would you give to someone who is applying for The Data Incubator, particularly someone with your background?

Learn and advance your programming and data mining skills in Python and SQL. Become familiar with statistics and basic machine learning methods. There are many useful online resources to start with. Once you feel comfortable with those tools, look for an interesting dataset to play with and prepare a capstone project that solves a data-driven business problem.

Could you tell us about your Data Incubator Capstone project?

For my capstone project, I developed a multi-functional web app, PokéAttractor, to help millions of Pokémon GO players catch Pokémon and to give business owners unprecedented insights about their customers. In the first part, I trained a k-Nearest Neighbor classifier on a Kaggle dataset of over 293,000 Pokémon sightings. My app predicts the probabilities of up to the 10 most likely Pokémon to appear at a given location and time. In the second part, I scraped over 275,000 tweets and analyzed the social media impact of Pokémon GO using machine learning, natural language processing, time series forecasting, and geographic data analytics techniques.
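For readers curious how a k-Nearest Neighbor classifier can turn nearby sightings into a ranked list of probabilities, here is a minimal sketch. It is not the PokéAttractor code: the function name `knn_top_species`, the feature layout (latitude, longitude, scaled hour of day), and the five synthetic sightings are all assumptions made for illustration, and it uses only the Python standard library rather than a machine learning library.

```python
import math
from collections import Counter


def knn_top_species(train, query, k=5, top_n=10):
    """Return up to top_n (species, probability) pairs.

    Probabilities are the share of votes each species receives
    among the k training sightings closest to the query point.
    """
    # Rank training sightings by Euclidean distance to the query and keep k.
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    # Count how many of the k neighbors belong to each species.
    votes = Counter(label for _, label in nearest)
    # Convert vote counts into fractions, most common species first.
    return [(species, count / len(nearest))
            for species, count in votes.most_common(top_n)]


# Synthetic, made-up sightings: ((latitude, longitude, hour_of_day / 24), species)
sightings = [
    ((40.00, -83.00, 0.30), "Pidgey"),
    ((40.01, -83.01, 0.35), "Pidgey"),
    ((40.02, -83.00, 0.40), "Rattata"),
    ((40.50, -83.50, 0.90), "Zubat"),
    ((40.51, -83.49, 0.95), "Zubat"),
]

# Ask which species are most likely near this hypothetical place and time.
predictions = knn_top_species(sightings, (40.01, -83.00, 0.33), k=3)
for species, probability in predictions:
    print(f"{species}: {probability:.2f}")
```

In this toy example the three closest sightings are two Pidgey and one Rattata, so the sketch ranks Pidgey first. A real dataset would likely need the features rescaled so that distance in degrees and distance in time are comparable, which this sketch does not attempt.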

How did you come up with the idea for the project?

Pokémon GO is officially the biggest mobile game in US history. I am one of the millions of Pokémon trainers who walk around daily and wonder how to catch ’em all more effectively and efficiently. Meanwhile, I realized that business owners also wonder how to capitalize on this massive opportunity and drive huge amounts of foot traffic and conversions. I came up with the idea to build a multi-functional web app that helps both players catch Pokémon and business owners attract fans.

What technologies did you use and what skills did you learn at TDI that you applied to the project?

I learned a lot of useful tools and techniques at TDI and applied them to my capstone project, including web scraping, SQL databases, Pandas, machine learning in Scikit-Learn (k-Nearest Neighbor classification, k-means clustering), NLP with the NLTK library, time series analysis, data visualization with D3 (JavaScript), Folium, and Bokeh, and web development with Bootstrap, CSS, Flask, and Heroku.
