Missing Data in the Age of Machine Learning
Many machine learning algorithms, and artificial neural networks in particular, do not tolerate missing data. Many practitioners simply remove records with missing fields, without considering the statistical bias that deletion may introduce. The field of imputation has matured: modern methods not only predict missing values but also reflect the uncertainty in those predictions. Traditional statistical estimators already take full advantage of these advanced imputation techniques. This tutorial illustrates techniques and architectures for incorporating advanced imputation into machine learning pipelines, including artificial neural networks.
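As a rough illustration of how a traditional estimator can carry imputation uncertainty forward, the sketch below pools results from several imputed copies of a dataset using Rubin's rules. The abstract does not name a specific method, so the choice of Rubin's rules is an assumption, and the function name `pool_rubin` and the input numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates and their squared standard errors.

    Assumption: Rubin's rules stand in for the "advanced imputation"
    workflow; the abstract itself does not specify a pooling method.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations, one variance per estimate")
    q_bar = mean(estimates)        # pooled point estimate
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total


# Hypothetical numbers: one coefficient estimated on three imputed datasets.
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var)
```

The between-imputation term is the part that a single imputed dataset cannot supply: it grows when the imputations disagree, which widens the reported uncertainty accordingly.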