Inferring Missing Categorical Information in Noisy and Sparse Web Markup