Scaling Machine Learning Data Repair Systems for Sparse Datasets