An aggregate learning approach for interpretable semi-supervised population prediction and disaggregation using ancillary data