Predicting civil war interventions
I am currently working on a project to build predictive models of civil war intervention. Numerous studies seek to explain third-party interventions in civil wars, but most focus on context-dependent factors and proximate causes, and they tell us little about which factors matter more or less. I therefore set out to build predictive models of intervention, both to aid theory development and to establish general empirical patterns in the internationalization of civil wars.
In short, I am building several models predicting who intervenes and on which side of a civil war in a global sample of civil wars from 1975 to 2017, with a total of 34,552 observations. I take an iterative approach to model training, tuning, and optimization based on Box’s loop, and I use several diagnostic tools to evaluate performance (F1-score), eliminate unnecessary features (recursive feature elimination and elastic net), and identify new ones to include (Colaresi and Mahmood's diagnostic plot procedure). Because this is, as far as I am aware, the first predictive model of intervention, I train and compare models using several different algorithms (random forest, SVM, XGBoost DART) to find the one that performs best. So far, the best-performing model uses XGBoost DART with around 70 features, and it performs about 61% better than the benchmark multinomial logit model.
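To make the evaluation step concrete, here is a minimal sketch of how an F1-based comparison against a benchmark can be computed. It is not the project's actual code. The three outcome labels, the macro-averaging of per-class F1, the reading of "better than the benchmark" as a relative improvement, and the toy predictions are all assumptions made for illustration; none of the numbers are results from the project.

```python
from collections import Counter

# Hypothetical outcome labels for a potential intervener (an assumption).
CLASSES = ("none", "government", "rebels")


def macro_f1(y_true, y_pred, classes=CLASSES):
    """Unweighted mean of per-class F1 scores."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # F1 = 2TP / (2TP + FP + FN); scored as 0 if the class never occurs.
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def relative_improvement(candidate, benchmark):
    """Share by which the candidate score exceeds the benchmark score."""
    return (candidate - benchmark) / benchmark


# Toy data, invented for illustration: interventions are the rare classes.
y_true = ["none"] * 6 + ["government"] * 2 + ["rebels"] * 2
# A benchmark that always predicts the majority class.
benchmark_pred = ["none"] * 10
# A candidate that recovers most interventions, with two mistakes.
candidate_pred = (["none"] * 5 + ["government"]
                  + ["government"] * 2
                  + ["rebels", "none"])

f1_benchmark = macro_f1(y_true, benchmark_pred)
f1_candidate = macro_f1(y_true, candidate_pred)
gain = relative_improvement(f1_candidate, f1_benchmark)
print(f"benchmark={f1_benchmark:.3f} candidate={f1_candidate:.3f} gain={gain:.1%}")
```

In the toy data the majority-class benchmark is right on 6 of 10 observations yet never identifies an intervention. An F1-based score penalizes that, which is why it suits rare outcomes such as intervention better than raw accuracy does.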
While this project is a work in progress, all relevant files needed to produce the work so far, as well as various graphs, can be found on GitHub.