Data Sources for Cool Data Science Projects Part 6

Links to Part 1, Part 2, Part 3, Part 4, Part 5

At The Data Incubator, we run a free eight-week data science fellowship to help our Fellows land industry jobs. Our hiring partners love considering Fellows who don’t mind getting their hands dirty with data. That’s why our Fellows work on cool capstone projects that showcase those skills. One of the biggest obstacles to successful projects has been getting access to interesting data. Here are a few cool public data sources you can use for your next project:

Government/Politics 

  1. Presidential Newspaper Endorsements: Noah Veltman has published a lot of cool data projects, including a record of the presidential endorsements made by over 100 newspapers from 1980 to the present. You can view it as a formatted table or a spreadsheet.
  2. Medicare Beneficiaries: There are more than 55 million Americans covered by Medicare, and the Medicare Health Outcomes Survey measures the ‘physical and mental health and well-being’ of beneficiaries over a two-year period. The data set covers recipients from 1998 to 2014.
  3. American Manufacturing: The Census Bureau publishes the Annual Survey of Manufactures (ASM). This is a state- and industry-level data set for America’s manufacturing sector.

Travel

  1. Take Flight: OpenFlights.org has compiled data on over 60,000 flight routes and almost 1,000 itineraries from the world’s busiest airport, Hartsfield-Jackson Atlanta International Airport. Each route includes the airline, departing airport, arriving airport, stops, and the type of plane.
  2. TSA Confiscation: Max Galka, a data and FOIA guru, built an interactive map of TSA confiscations based on data collected from the government. The set contains a total of 22,044 “dangerous items.”
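
If you want to poke at the flight routes, here is a minimal Python sketch of how you might count departures per airline from a single airport. The column names are an assumption based on the comma-separated layout OpenFlights documents for its routes file, and the sample rows are made up for illustration, so check both against the file you actually download.

```python
import csv
import io
from collections import Counter

# Assumed column order for the OpenFlights routes file (verify against
# the documentation that ships with the data before relying on it).
ROUTE_FIELDS = [
    "airline", "airline_id",
    "source_airport", "source_airport_id",
    "dest_airport", "dest_airport_id",
    "codeshare", "stops", "equipment",
]

# Illustrative rows only, shaped like the real file; not actual data.
SAMPLE_ROUTES = """\
DL,2009,ATL,3682,JFK,3797,,0,738
DL,2009,ATL,3682,LAX,3484,,0,757
AA,24,ORD,3830,ATL,3682,,0,M80
WN,4547,ATL,3682,MDW,3747,,0,73W
"""


def load_routes(text):
    """Parse header-less CSV text into a list of dicts keyed by ROUTE_FIELDS."""
    reader = csv.DictReader(io.StringIO(text), fieldnames=ROUTE_FIELDS)
    return list(reader)


def departures_by_airline(routes, airport_code):
    """Count routes leaving airport_code, grouped by airline code."""
    return Counter(
        row["airline"] for row in routes
        if row["source_airport"] == airport_code
    )


routes = load_routes(SAMPLE_ROUTES)
atl_counts = departures_by_airline(routes, "ATL")
print(atl_counts.most_common())
```

To run this on the real data, you could read the downloaded file's contents into a string and pass it to `load_routes` in place of `SAMPLE_ROUTES`.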

Police/Crime 

  1. State Prison Admissions: The New York Times has gathered data on the number of inmates sent to state prison by county in 2006, 2013, and 2014. The numbers were taken from the National Corrections Reporting Program, which is not open to the public but is accessible to select reporters.
  2. NYC Police Complaints: New York City now publishes official complaints against city police from every closed case since 2006. There are over 200,000 complaints, all of which include the location and whether video evidence was present, but no information about the officer involved.

 

While building your own project cannot replicate the experience of a fellowship at The Data Incubator (our Fellows get amazing access to hiring managers and to nonpublic data sources), we hope this will get you excited about working in data science. And when you are ready, you can apply to be a Fellow!

Got any more data sources?  Let us know and we’ll add them to the list!
