Learning from Multiple Partially Observed Views - an Application to Multilingual Text Categorization

Amini, Massih R., Usunier, Nicolas, Goutte, Cyril

Neural Information Processing Systems 

We address the problem of learning classifiers when observations have multiple views, some of which may not be observed for all examples. We assume the existence of view generating functions that can approximately complete the missing views. This situation arises, for example, when learning text classifiers from multilingual collections in which documents are not available in all languages. In that case, Machine Translation (MT) systems may be used to translate each document into the missing languages. We derive a generalization error bound for classifiers learned on examples with multiple artificially created views.
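
As an illustration of the setting only, the sketch below shows how a view generating function might complete the missing views of a multilingual document. It is a minimal sketch under stated assumptions, not the authors' method: the names `complete_views` and `toy_translate` are hypothetical, and `toy_translate` is a placeholder standing in for a real MT system.

```python
from typing import Callable, Dict, Optional

# One example: language code -> document text, or None if that view is unobserved.
Example = Dict[str, Optional[str]]


def complete_views(example: Example,
                   generate: Callable[[str, str, str], str]) -> Dict[str, str]:
    """Fill each missing view by applying `generate` to the first observed view.

    `generate(text, src_lang, tgt_lang)` plays the role of a view generating
    function (hypothetical signature, assumed for illustration).
    """
    observed = {lang: text for lang, text in example.items() if text is not None}
    if not observed:
        raise ValueError("at least one view must be observed")
    src_lang, src_text = next(iter(observed.items()))

    completed: Dict[str, str] = {}
    for lang, text in example.items():
        if text is None:
            # Artificially created view: only an approximation of the true one.
            completed[lang] = generate(src_text, src_lang, lang)
        else:
            completed[lang] = text
    return completed


def toy_translate(text: str, src_lang: str, tgt_lang: str) -> str:
    # Placeholder for an MT system; it merely tags the text with the direction.
    return f"[{src_lang}->{tgt_lang}] {text}"


doc = {"en": "stock markets fell", "fr": None}
full = complete_views(doc, toy_translate)
```

After completion, every example carries all views, so a classifier can be trained on each view or on their combination; the generated views remain approximations of the unobserved originals.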