How to Catch ‘Em All: Alumni Spotlight on Yina Gu

Yina was a Fellow in our Winter 2017 cohort who landed a job with one of our hiring partners, Opera Solutions.

Tell us about your background. How did it set you up to be a great Data Scientist?

I received my PhD in computational chemistry from The Ohio State University. For my PhD research, I developed multiple predictive models and published web servers that solve various biophysics problems using machine learning and statistical methods in Python, R, and Matlab. The data science skills and experience I gained over my five years as a PhD student not only allowed me to solve fundamental scientific problems effectively and efficiently, but also enabled my transition from academia to industry to tackle real-world challenges.

What do you think you got out of The Data Incubator?

The eight-week intensive training at The Data Incubator really helped me go deeper into the data science field and build the essential skills needed to work in the big data industry with cutting-edge analytics techniques, including programming, machine learning, data visualization, and a business mindset. Last but not least, I believe networking with the other very talented Fellows is the most valuable thing I got out of TDI!

What advice would you give to someone who is applying for The Data Incubator, particularly someone with your background?

Learn and advance your programming and data mining skills in Python and SQL. Be familiar with statistics and basic machine learning methods. There are many useful online resources to start with. Once you feel comfortable with those tools, look for an interesting dataset to play with and prepare a capstone project that solves a data-driven business problem.

Could you tell us about your Data Incubator Capstone project?

For my capstone project, I developed a multi-functional web app, PokéAttractor, to help millions of Pokémon GO players catch Pokémon and to give business owners unprecedented insights about their customers. In the first part, I trained a k-Nearest Neighbor classifier on a Kaggle dataset of over 293,000 Pokémon sightings. My app predicts the probabilities of up to the 10 most likely Pokémon to appear at a given location and time. In the second part, I scraped over 275,000 tweets and analyzed the social media impact of Pokémon GO using machine learning, natural language processing, time series forecasting, and geographic data analytics techniques.
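To make the first part concrete, here is a minimal sketch of how a k-Nearest Neighbor model can turn sightings into per-species probabilities for a location and time. It is not the PokéAttractor code: the toy sightings, the function names, and the `hour_weight` scaling are all assumptions made for illustration, and it uses only the Python standard library rather than Scikit-Learn.

```python
import math
from collections import Counter

# Hypothetical toy sightings: (latitude, longitude, hour_of_day, species).
# The real project used a Kaggle dataset of over 293,000 sightings.
SIGHTINGS = [
    (40.00, -83.01, 8, "Pidgey"),
    (40.01, -83.00, 9, "Pidgey"),
    (40.00, -83.02, 10, "Rattata"),
    (40.02, -83.01, 21, "Zubat"),
    (40.01, -83.02, 22, "Zubat"),
    (40.03, -83.00, 23, "Gastly"),
]


def hour_gap(a, b):
    """Circular gap between two hours of the day (23 and 1 are 2 apart)."""
    d = abs(a - b) % 24
    return min(d, 24 - d)


def distance(query, sighting, hour_weight=0.01):
    """Euclidean distance over latitude, longitude, and a weighted hour gap.

    hour_weight is an assumed scaling that puts hours on a scale
    comparable to degrees; a real model would tune or standardize it.
    """
    lat, lon, hour = query
    s_lat, s_lon, s_hour, _species = sighting
    return math.sqrt(
        (lat - s_lat) ** 2
        + (lon - s_lon) ** 2
        + (hour_weight * hour_gap(hour, s_hour)) ** 2
    )


def predict_top_species(query, sightings, k=3, top_n=10):
    """Return up to top_n (species, probability) pairs from the k nearest sightings."""
    neighbors = sorted(sightings, key=lambda s: distance(query, s))[:k]
    votes = Counter(s[3] for s in neighbors)
    # Each probability is the fraction of the k neighbors with that species.
    return [(species, count / len(neighbors)) for species, count in votes.most_common(top_n)]


if __name__ == "__main__":
    for species, prob in predict_top_species((40.015, -83.015, 22), SIGHTINGS):
        print(f"{species}: {prob:.2f}")
```

For the evening query above, the three nearest toy sightings are two Zubat and one Gastly, so the sketch ranks Zubat first. Capping the output with `top_n=10` mirrors the "up to 10" behavior described in the answer.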

How did you come up with the idea for the project?

Pokémon GO is officially the biggest mobile game in US history. I am one of the millions of Pokémon trainers who walk around daily wondering how to catch ’em all more effectively and efficiently. Meanwhile, I realized that business owners also wonder how to capitalize on this massive opportunity and drive huge amounts of foot traffic and conversions. I came up with the idea of building a multi-functional web app to help both players catch Pokémon and business owners attract fans.

What technologies did you use and what skills did you learn at TDI that you applied to the project?

I learned a lot of useful tools and techniques at TDI and applied them to my capstone project, including web scraping, SQL databases, Pandas, machine learning in Scikit-Learn (k-Nearest Neighbor classification, k-means clustering), NLP with the NLTK library, time series analysis, data visualization with JavaScript D3, Folium, and Bokeh, and web development with Bootstrap, CSS, Flask, and Heroku.
