Data Science Project Ideas

We love data science and cool data science projects.  If you’re applying for our free data science fellowship and looking to propose a data science project, here are four project ideas.


GitHub is a great source of data on how engineers write code.  A recent post found discrimination against pull requests submitted by women on GitHub, although perhaps that study could have been better.  But there are lots of other ideas to pursue.  We can easily train an n-gram classifier to tell whether a line of code is a comment, then use it to search for commented-out code.  Are repos by academics more likely to have commented-out code?  Are they more likely to violate lint rules?  Additionally, it would be interesting to analyze commits that fix bugs to predict which lines of code are more likely to contain bugs.
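As a rough sketch of the comment classifier idea, here is a tiny Naive Bayes model over character bigrams, written with only the standard library. The class name, the training lines, and their labels are all made up for illustration; a real project would train on many thousands of labeled lines pulled from actual repositories and would likely use a proper ML library.

```python
import math
from collections import Counter


def char_ngrams(line, n=2):
    """Return the character n-grams of a stripped line, with a start marker."""
    text = "^" + line.strip()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


class NgramCommentClassifier:
    """Tiny multinomial Naive Bayes over character n-grams (illustrative only)."""

    def __init__(self, n=2):
        self.n = n
        self.counts = {True: Counter(), False: Counter()}
        self.totals = {True: 0, False: 0}
        self.docs = {True: 0, False: 0}
        self.vocab = set()

    def fit(self, lines, labels):
        for line, is_comment in zip(lines, labels):
            grams = char_ngrams(line, self.n)
            self.counts[is_comment].update(grams)
            self.totals[is_comment] += len(grams)
            self.docs[is_comment] += 1
            self.vocab.update(grams)
        return self

    def _log_score(self, grams, label):
        # Log prior plus Laplace-smoothed log likelihood of each n-gram.
        score = math.log(self.docs[label] / sum(self.docs.values()))
        denom = self.totals[label] + len(self.vocab) + 1
        for gram in grams:
            score += math.log((self.counts[label][gram] + 1) / denom)
        return score

    def predict(self, line):
        """Return True if the line looks more like a comment than like code."""
        grams = char_ngrams(line, self.n)
        return self._log_score(grams, True) > self._log_score(grams, False)


# Hypothetical hand-labeled training lines (True = comment).
train_lines = [
    "# compute the mean", "# TODO: remove this hack", "# x = load_data(path)",
    "# return the result", "x = load_data(path)", "return total / count",
    "for row in rows:", "total += row.value",
]
train_labels = [True, True, True, True, False, False, False, False]

clf = NgramCommentClassifier(n=2).fit(train_lines, train_labels)
print(clf.predict("# remove the mean"))   # a prose-like comment
print(clf.predict("total = row.value"))   # a line of code
```

Finding commented-out code would then be a second step: take the lines flagged as comments, strip the comment marker, and check whether what remains looks like code.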

Open Food

Ever wondered what makes Mexican food unique or what’s distinctive about Polish cuisine?  There are plenty of recipe websites with ingredient lists.  You could easily run PCA, K-Means, or your favorite clustering algorithm, or train a classifier, on dishes labeled by cuisine.  Can you combine this information to find an “Eastern European Ingredients” eigenvector?
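To make the "ingredients eigenvector" idea concrete, here is a minimal sketch of PCA's first principal component, computed by power iteration with only the standard library. The recipes and their ingredient sets are invented toy data, not scraped from any site, and a real analysis would more likely use a library PCA on thousands of recipes.

```python
import math

# Hypothetical toy data: recipe name -> ingredient set (invented for illustration).
recipes = {
    "tacos": {"tortilla", "chili", "lime", "cilantro"},
    "salsa": {"chili", "lime", "cilantro", "onion"},
    "enchiladas": {"tortilla", "chili", "onion"},
    "pierogi": {"potato", "sour cream", "onion"},
    "bigos": {"cabbage", "dill", "onion"},
    "placki": {"potato", "sour cream", "dill"},
}

# One binary column per ingredient, one row per recipe.
ingredients = sorted(set().union(*recipes.values()))
rows = [[1.0 if ing in items else 0.0 for ing in ingredients]
        for items in recipes.values()]

# Center each ingredient column, as PCA requires.
n, m = len(rows), len(ingredients)
means = [sum(row[j] for row in rows) / n for j in range(m)]
centered = [[row[j] - means[j] for j in range(m)] for row in rows]


def cov_times(v):
    """Multiply the unnormalized covariance matrix X^T X by v without forming it."""
    proj = [sum(x * y for x, y in zip(row, v)) for row in centered]
    return [sum(centered[i][j] * proj[i] for i in range(n)) for j in range(m)]


def first_component(iterations=200):
    """Power iteration for the leading eigenvector of the covariance matrix."""
    v = centered[0][:]  # deterministic, non-zero starting vector
    for _ in range(iterations):
        w = cov_times(v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


component = first_component()
loadings = dict(zip(ingredients, component))
for ing, weight in sorted(loadings.items(), key=lambda kv: kv[1]):
    print(f"{ing:12s} {weight:+.2f}")
```

On this toy data the ingredients from the two cuisines land at opposite ends of the component, which is the kind of axis a cuisine-level "eigenvector" would capture.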

Open Drinks

If you’re interested in cocktails, some recipe sites’ ingredient lists are even hyperlinked and cross-referenced for you.  You could easily use SVD or other recommendation engine techniques to find cocktails that are similar to the ones you already drink.  Cocktails are supposed to have a balance of the five basic tastes.  Drink Mixer actually gives you the nutritional information to break this down.  Connoisseurs of beer know that some review sites have very in-depth beer reviews, often with thousands of reviews per beer.  Can you use NLP to find similar beers?
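Here is a small sketch of the "find similar cocktails" idea. It uses plain cosine similarity between ingredient sets rather than SVD, since that is the simplest recommendation technique to show in a few lines. The ingredient lists are simplified and hand-typed for illustration, and the function names are hypothetical.

```python
import math

# Simplified, hand-typed ingredient lists (illustrative, not from any site).
cocktails = {
    "margarita": {"tequila", "triple sec", "lime juice"},
    "daiquiri": {"rum", "lime juice", "simple syrup"},
    "mojito": {"rum", "lime juice", "simple syrup", "mint", "soda water"},
    "negroni": {"gin", "campari", "sweet vermouth"},
    "manhattan": {"whiskey", "sweet vermouth", "bitters"},
}


def cosine(a, b):
    """Cosine similarity between two ingredient sets treated as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def most_similar(name, top_k=3):
    """Rank the other cocktails by ingredient overlap with `name`."""
    target = cocktails[name]
    scores = [(other, cosine(target, items))
              for other, items in cocktails.items() if other != name]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_k]


print(most_similar("daiquiri"))
print(most_similar("negroni"))
```

With real data you would probably weight ingredients by quantity and factor in ratings, which is where SVD-style methods start to earn their keep.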

NYC Taxi Data

There’s plenty of analysis of NYC Taxi Data, but it’s often about optimizing fares or finding which street to hail a taxi on.  But there’s a tonne of sociological data to be unlocked.  Where do the Bridge and Tunnel Crowd go on Friday or Saturday night?  Where do theatre or symphony goers head home to after their performances?  Where do the bankers go to eat or sleep after work?  Where do the consultants, who fly out every Sunday evening and arrive back in town Thursday evening, live?  What are the most popular hotels amongst Amtrak travelers?  What about flights?  Where do tourists go after hitting up the Met or the Statue of Liberty?  Can you help companies understand where their customers live?
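Most of these questions boil down to the same recipe: pick trips that start near a place at a certain time, then look at where they end. A minimal sketch follows, assuming each trip is a dict with hypothetical field names (`pickup_lat`, `pickup_lon`, `pickup_time`, `dropoff_lat`, `dropoff_lon`). The venue coordinates and trips are made up, and the real dataset's column names will differ.

```python
import math
from collections import Counter
from datetime import datetime


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    radius = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def dropoff_cells(trips, venue, radius_km=0.3, hours=range(22, 24), cell=0.01):
    """Count drop-offs per grid cell for trips picked up near `venue` during `hours`."""
    counts = Counter()
    for trip in trips:
        dist = haversine_km(trip["pickup_lat"], trip["pickup_lon"], *venue)
        if dist <= radius_km and trip["pickup_time"].hour in hours:
            # Bucket the drop-off point into an integer (lat, lon) grid cell.
            key = (math.floor(trip["dropoff_lat"] / cell),
                   math.floor(trip["dropoff_lon"] / cell))
            counts[key] += 1
    return counts


def make_trip(plat, plon, when, dlat, dlon):
    return {"pickup_lat": plat, "pickup_lon": plon, "pickup_time": when,
            "dropoff_lat": dlat, "dropoff_lon": dlon}


# Hypothetical venue and made-up trips, for illustration only.
venue = (40.7590, -73.9845)
trips = [
    make_trip(40.7592, -73.9847, datetime(2015, 1, 9, 22, 30), 40.7853, -73.9512),
    make_trip(40.7588, -73.9843, datetime(2015, 1, 9, 23, 5), 40.7858, -73.9517),
    make_trip(40.7591, -73.9846, datetime(2015, 1, 9, 14, 0), 40.7003, -73.9904),
    make_trip(40.7000, -74.0100, datetime(2015, 1, 9, 22, 45), 40.7853, -73.9512),
    make_trip(40.7589, -73.9844, datetime(2015, 1, 9, 22, 10), 40.6782, -73.9442),
]

counts = dropoff_cells(trips, venue)
print(counts.most_common(3))
```

Swapping the venue for a train station or museum and changing the hour window gives you the other questions in the list; mapping grid cells back to neighborhoods is the natural next step.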
