David was a Fellow in our Winter 2016 cohort who landed a job at our hiring partner, Amazon.
Tell us about your background. How did it set you up to be a great Data Scientist?
Before joining The Data Incubator, I completed my Ph.D. in chemistry at Johns Hopkins University, where I focused on the design and synthesis of new magnetic materials. My research gave me the opportunity to work alongside scientists in many different disciplines and exposed me to a vast array of experimental techniques and theoretical constructs. From a data science perspective, this meant that I was constantly encountering new types of data and searching for scientifically rigorous models to explain those results. As the volume and complexity of these datasets increased, graphical data analysis tools like Excel and Origin were no longer making the cut for me, and I gradually transitioned to performing data transformation and analysis entirely in Python. That was a big technical leap that took a lot of time and frustration, but I think it ultimately made me a better researcher.
From a research perspective, working in a vibrant academic setting also meant learning how to ask bold questions, even at the risk of sounding stupid in front of a large group of mentors and peers, something I've done more than I care to admit. For me, finding the right question to ask is just as important as having the technical expertise to find an answer, and that's one of the things that makes data science so exciting.
What do you think you got out of The Data Incubator?
What advice would you give to someone who is applying for The Data Incubator, particularly someone with your background?
Start learning Python! Codecademy and Google have great tutorials for new programmers. Once you've got the basics down, start applying it to your research, bit by bit; Stack Exchange will be your best friend. Get used to diving into new packages and deciphering their documentation, because you're going to be doing it a lot. Most importantly, don't get discouraged by how much there is to learn.
What is your favorite thing you learned at The Data Incubator?
I had never been exposed to graph analysis before, so when we used NetworkX to build a social network in the first week at TDI, I was completely blown away. I’ve ended up using NetworkX in interviews and personal projects several times since that first week because it provides a really intuitive and efficient way to deal with complex networks.
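To give a flavor of why NetworkX feels so intuitive, here is a minimal sketch of the kind of social-network analysis described above. The names and friendship pairs are made up for illustration; only standard NetworkX calls are used.

```python
import networkx as nx

# Hypothetical friendship pairs standing in for real social-network data
friendships = [
    ("Alice", "Bob"), ("Alice", "Carol"), ("Bob", "Carol"),
    ("Carol", "Dave"), ("Dave", "Eve"),
]
G = nx.Graph()
G.add_edges_from(friendships)

# Degree centrality: the fraction of the network each person is directly tied to
centrality = nx.degree_centrality(G)
most_connected = max(centrality, key=centrality.get)

# Shortest path between two people, i.e. their "degrees of separation"
path = nx.shortest_path(G, "Alice", "Eve")
```

In a few lines you get centrality scores and shortest paths, which is exactly the sort of complex-network question that would be tedious to answer with tabular tools.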
Could you tell us about your Data Incubator Capstone project?
When I moved across the country to start grad school five years ago, I had no idea where to live, and no idea how to search for a place besides reading blogs and searching Craigslist. It was a frustrating and scary problem, to say the least. To solve that problem, I created a web application that leverages nine different geospatial datasets, containing information such as crime rates and grocery store locations, to help a user target their search for a new home in Baltimore City. The primary functionality uses a statistical method called Gaussian kernel density estimation to compute recommended hotspots for each user and then display those hotspots on a map. I built the application using Python, Flask, CartoDB and Bootstrap, all of which are topics covered in TDI's curriculum. Check it out at: http://stomping-grounds.herokuapp.com
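The hotspot step above can be sketched with SciPy's `gaussian_kde`. This is a simplified illustration, not the app's actual code: the point locations are randomly generated stand-ins for a geospatial dataset (roughly centered on Baltimore's coordinates), and a real application would weight and combine several such density surfaces.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic (lon, lat) points standing in for one dataset, e.g. grocery stores
rng = np.random.default_rng(0)
points = rng.normal(loc=[-76.61, 39.29], scale=0.02, size=(200, 2))

# Fit a 2-D Gaussian kernel density estimate over the point locations
kde = gaussian_kde(points.T)

# Evaluate the density on a grid covering the area of interest
lon = np.linspace(points[:, 0].min(), points[:, 0].max(), 50)
lat = np.linspace(points[:, 1].min(), points[:, 1].max(), 50)
xx, yy = np.meshgrid(lon, lat)
density = kde(np.vstack([xx.ravel(), yy.ravel()])).reshape(xx.shape)

# The grid cell with the highest estimated density is the top hotspot
i, j = np.unravel_index(density.argmax(), density.shape)
hotspot = (xx[i, j], yy[i, j])
```

The resulting density grid is what gets rendered as a heat layer on the map, with the highest-density cells surfaced as recommendations.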