Data Sources for Cool Data Science Projects: Part 5

Links to Part 1, Part 2, Part 3, Part 4

At The Data Incubator, we run a free eight-week data science fellowship to help our Fellows land industry jobs. Our hiring partners love considering Fellows who don’t mind getting their hands dirty with data. That’s why our Fellows work on cool capstone projects that showcase those skills. One of the biggest obstacles to a successful project has been getting access to interesting data. Here are some more cool public data sources you can use for your next project:

Environmental Data

  1. Climate change: Climate data is hot right now, and US Climate Data is a good starting point for a lot of great, up-to-date datasets. You can find more detailed datasets from NOAA, and climate change datasets from the US government.
  2. Sea Ice: The University of Colorado’s National Snow and Ice Data Center publishes the Sea Ice Index, which records ice coverage in the Antarctic and Arctic Oceans. Its datasets include daily and monthly measurements from 1978 to the present!
  3. Forest Coverage: The World Bank maintains data on forest coverage per country and across the globe. Fun fact: Over 98% of land area in Suriname was forest in 2015.


Education Data

  1. Education Fulfillment: Researchers at the Wittgenstein Centre for Demography and Global Human Capital, based in Vienna, have compiled a dataset of recorded and projected education levels for over 150 countries, dating back to 1970 and projecting forward to 2060. You can download the complete dataset through the Wittgenstein Centre Data Explorer.
  2. Student Loans: The US Department of Education publishes default rates for student loans, broken down by school, school type, and state. It recently published data on students whose loans came due for repayment in 2013.

International Data

  1. New Zealand National Statistics: New Zealand has a rather impressive national statistics website. The small nation publishes data on everything from businesses and abortion to the Māori census.
  2. International Financial History: The Jordà-Schularick-Taylor Macrohistory Database contains data for 17 “advanced” economies dating back to 1870, and it is updated on an annual basis. Its authors claim it is the “most extensive long run macro-financial dataset to date.”


While building your own project cannot replicate the experience of a fellowship at The Data Incubator (our Fellows get amazing access to hiring managers and to nonpublic data sources), we hope this will get you excited about working in data science. And when you are ready, you can apply to be a Fellow!

Got any more data sources?  Let us know and we’ll add them to the list!


