Andrew was a Fellow in our summer cohort who landed a job with one of our hiring partners, Freddie Mac, after finishing his PhD at Yale.
Tell us about your background. How did it set you up to be a great Data Scientist?
Before applying to The Data Incubator, I received my Ph.D. in particle physics. I spent years analyzing proton collisions from a particle detector which collects several petabytes of data each year. This experience set me up to be a great data scientist for two important reasons. For one thing, it gave me plenty of experience working with real data. From this, I developed a good understanding of each stage of the data analysis process and the challenges associated with it. The other reason my background prepared me for a career in data science was that it helped me develop great skills as an experimentalist in general. I learned not only what to look for when designing an experiment, but also how to set up and conduct the experiment to obtain meaningful results.
What do you think you got out of The Data Incubator?
For starters, I gained experience using several data science tools and techniques I hadn’t used before. Many of these tools are commonly used in industry, unlike some of the special-purpose software I had worked with for my research in physics. The mini projects I completed with these tools provided great talking points for discussing my experience in interviews.
Another major benefit that I got from The Data Incubator was access to potential employers. By being able to reach out directly to hiring managers at partner companies, I was able to bypass the painful process of submitting my resume to a giant stack and hoping that it would be noticed. Through these connections, I was able to set up interviews with some awesome companies that likely would not have paid me any attention otherwise.
Finally, I got to meet and work alongside the other Fellows. These were all incredibly talented individuals who had great insights into solving many challenging data science problems. I must have learned at least as much from them as from the official coursework.
What advice would you give to someone who is applying for The Data Incubator, particularly someone with your background?
I would start by suggesting that you think of a project that you would particularly like to work on, preferably something that may have some applications in industry. I would also take some time to learn some tools that are commonly used for data science, especially ones that you may not have used before: perhaps R, Python tools (scikit-learn, pandas, NumPy, etc.), SQL, or others. I would practice using these tools, preferably on your pet project. This will not only improve your skills as a data scientist, but also give you a great head start on a project to show off in interviews.
I would also suggest you start brushing up on your basic probability and programming skills. These will be crucial not only for The Data Incubator application, but also for any data science interviews that you land. I have found that practice is key here. [Editor’s Note: For more information about how to prepare for The Data Incubator, check out this post.]
Lastly, start thinking about which industries you would like to work in. This may help not only with finding a project, but also with focusing your job search.
What is your favorite thing you learned at The Data Incubator?
My favorite technology that I learned from The Data Incubator was IPython Notebook. I was already familiar with Python, but this tool made it so much easier to experiment with the command line. I also really enjoyed using Python’s scikit-learn machine learning package for its easy-to-use algorithms.
Could you tell us about your Data Incubator mini projects?
Over the course of the program, I completed several mini projects, usually 1-2 per week. Each mini project was designed to develop proficiency with a different aspect of data science. These included such topics as database querying, machine learning, and MapReduce. Each project usually included one or two simpler tasks to help understand the basic premise of the topic, but also much more challenging tasks designed to really push your critical thinking and problem-solving skills. What I think worked best about the mini projects was that they presented challenges that many data scientists face on a day-to-day basis. In many cases I had to discover and teach myself new tools in order to solve these problems. Whenever I didn’t obtain the results I was expecting, I had to systematically locate the problems and experiment with new ways of fixing them.