Ellen was a Fellow in our Spring 2016 cohort who landed a job with one of our hiring partners, Protenus.
Tell us about your background. How did it set you up to be a great Data Scientist?
The title data scientist implies both technical skills and the ability to ask and answer questions scientifically. My academic decisions were driven by the desire to develop the second of these skills. I studied math in college because I was attracted to the logical way of thinking and proving concepts that I encountered in math class. I went on to do a PhD in neuroscience in part because I wanted to pursue a question to an extreme level of detail, leveraging the logic that I learned doing math. (I also wanted to understand how learning works at the level of neural ensembles, but that’s another story.)
As a result of my academic trajectory, I also learned to write analysis code. I enjoyed coding, and at first I considered it a perk of my particular field of neuroscience that a lot of coding (mostly in MATLAB) was necessary for analyzing the large datasets I was collecting. However, I eventually came to appreciate coding in its own right and started taking steps to learn new languages and to improve my analyses by incorporating better tools. By this time I had realized that I would be happy coding full time, so The Data Incubator was a great segue to new concepts and tools in the world of data science.
What do you think you got out of The Data Incubator?
What advice would you give to someone who is applying for The Data Incubator, particularly someone with your background?
Take every opportunity to broaden your technical skill set, but remember to take the time to come up with good questions and use your scientific training. In the end, your technical skills will be boxes to check, but the value you add to a company will be in your unique perspective and ability to think critically.
What is your favorite thing you learned at The Data Incubator?
Understanding the concept of MapReduce and the location of District Taco.
Could you tell us about your Data Incubator Capstone project?
I started my project by looking at the effect of food deserts on community health using the Food Environment Atlas from the USDA. I expected to see a correlation between the availability of grocery stores and measures of health, such as the rate of obesity in a community. Instead, I found that health suffers in poor communities regardless of the availability of fresh food, and that poor health correlates with the prevalence of fast food. This suggested that community health could be improved if more people were able to cook at home, so I extended my project to include tools to make cooking more accessible. These tools were based on an analysis of ingredient lists from recipes.