One of our fellows recently had a piece published about her unique and timely capstone project. The original piece is posted on Data Driven Journalism.
In her own words:
This war is not only important due to its staggering costs (both human and financial) but also on account of its publicly available and well-documented daily records from 2004 to 2010.
These documents provide a very high spatial and temporal resolution view of the conflict. For example, I extracted from these government memos the number of violent events per day in each county. Then, using latent factor analysis techniques, e.g. non-negative matrix factorization, I was able to cluster the top three principal war zones. Interestingly these principal conflict zones were areas populated by the three main ethno-religious groups in Iraq.
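To make the factorization step concrete, here is a minimal sketch of non-negative matrix factorization applied to a day-by-county matrix of event counts. It is not the fellow's actual code: the data is synthetic, the matrix sizes, the `nmf` helper, and the choice of multiplicative updates are all assumptions made for illustration, and the real analysis may have used a different solver or preprocessing.

```python
import numpy as np

def nmf(X, k, n_iter=500, eps=1e-9, seed=0):
    """Approximate non-negative X (days x counties) as W @ H.

    Uses the classic multiplicative update rules for the Frobenius-norm
    objective. W holds each component's daily activity, H holds each
    component's weight on every county.
    """
    rng = np.random.default_rng(seed)
    n_days, n_counties = X.shape
    W = rng.random((n_days, k)) + eps
    H = rng.random((k, n_counties)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative by construction.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Synthetic stand-in for the real records (hypothetical sizes):
# 200 days, 12 counties, 3 hidden zones of 4 counties each.
rng = np.random.default_rng(42)
n_days, n_counties, k = 200, 12, 3
true_zone = np.repeat(np.arange(k), n_counties // k)
intensity = rng.gamma(2.0, 2.0, size=(n_days, k))   # daily activity per zone
rates = intensity[:, true_zone] + 0.1               # expected events per county-day
X = rng.poisson(rates).astype(float)                # observed event counts

W, H = nmf(X, k)

# Assign each county to the component that weights it most heavily.
county_zone = H.argmax(axis=0)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print("county -> component:", county_zone)
print("relative reconstruction error:", round(float(rel_error), 3))
```

In this sketch, the rows of `H` play the role of the "principal war zones" (groups of counties whose violence rises and falls together), while the columns of `W` show when each zone was active. Component labels are arbitrary, so the numbering of the recovered zones carries no meaning.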
You can watch her explain it herself:
Editor’s Note: The Data Incubator is a data science education company. We offer a free eight-week fellowship helping candidates with PhDs and master's degrees enter data science careers. Companies can hire talented data scientists or enroll employees in our data science corporate training.