Aviv was a Fellow in our Fall 2016 cohort who landed a job with our hiring partner, Argyle Data.
Tell us about your background. How did it set you up to be a great Data Scientist?
My background is in Geosciences. I was a climate modeler so I had a substantial amount of experience with scientific computing (numerical linear algebra, differential equations, data assimilation, etc).
What do you think you got out of The Data Incubator?
Two things. First, I got my first exposure to data science in a non-academic setting: what sort of problems data scientists might be tasked with solving within a company, and the tools they use to do so.
Second, and more importantly, I got to know a fantastic group of people and make valuable connections. I got the interview for my current position based on a recommendation from a friend from my cohort who had been hired prior to me (we now work together, which is great!). Recently, I got another email from a friend indicating that he would be happy to refer me to his company if I wished. I don’t know if I just happened to have been part of a particularly great cohort, but I really did have a blast going through the incubator with them, and I look forward to keeping in touch with all of them for years to come.
What advice would you give to someone who is applying for The Data Incubator, particularly someone with your background?
Start programming in Python ASAP. I had lots of experience with Matlab, and some with R, and still it was a steep learning curve. There's really no good reason to try to pick up Python and data science simultaneously. For me, hitting tons of annoying little syntax idiosyncrasies at every step just sapped bandwidth that would have been better used to absorb the data science material we were being bombarded with on a weekly basis. Moreover, whatever you think of Python as a programming language, if you want a data science job you'd better get on board. As I was told in a job interview: "R will not get you a job in Silicon Valley". So get comfortable being uncomfortable.
What is your favorite thing you learned at The Data Incubator?
I really liked PySpark's SQLContext. I thought it was a very intelligently designed tool for working with distributed data in a way that felt almost like working with in-memory data.
Could you tell us about your Data Incubator Capstone project?
I built a model that looked at the relationship between Bay Area housing prices and geologic hazards. There's actually a positive relationship between the two, suggesting that the highest-value zip codes also tend to be associated with greater risk of earthquakes and landslides.
How did you come up with the idea for the project?
Hiking around the Bay Area I saw million-dollar mansions being built in very steep and unlikely places. I imagined that the people building those houses were probably unaware of the risk they were taking upon themselves of their dream home sliding down the mountain.
How could others use your data?
Well, after I started the project I discovered that some folks had built a neat little app called Temblor that does something similar. The folks there are trying to disrupt the earthquake insurance industry and improve awareness of seismic hazards. I hope they're successful in doing so.