WikiSeq: Mining Maximally Informative Simple Sequences from Wikipedia

Nair, Goutam (International Institute of Information Technology, Hyderabad) | Pudi, Vikram (International Institute of Information Technology, Hyderabad)

AAAI Conferences 

The problem of ordering the documents of a large collection into a sequence that is efficient for learning (both human and machine) is of high practical significance, but it has not yet been well formulated. We formulate this problem as mining a maximally informative simple sequence of documents. The mined sequence should be maximally informative in the sense that the reader learns quickly by reading only a few documents, and it should be simple so that the reader is not overwhelmed while trying to learn the content. The task can be posed as follows: given that a reader wishes to read at most k documents, which documents should be selected from the repository, and in what order, so as to provide maximum information? We present the WikiSeq algorithm for this purpose. We also design a metric based on information gain to evaluate WikiSeq objectively, and we conduct experiments comparing it with indicative baselines. Finally, we provide case studies that subjectively illustrate WikiSeq’s merits.
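To make the "at most k documents, in what order" question concrete, the following is a minimal sketch of a naive greedy selection. It is not the WikiSeq algorithm, whose details the abstract does not give. It assumes, purely for illustration, that each document is reduced to a set of terms and that the "information" a document adds is the number of terms the reader has not yet seen. The function name `greedy_sequence` and the toy data are hypothetical, and the sketch does not model the simplicity requirement at all.

```python
from typing import Dict, List, Set


def greedy_sequence(docs: Dict[str, Set[str]], k: int) -> List[str]:
    """Return up to k document titles, ordered by greedy marginal gain.

    Assumption (not from the paper): a document's gain is the number of
    its terms that no previously selected document has covered.
    """
    seen: Set[str] = set()       # terms the reader has already met
    order: List[str] = []        # selected titles, in reading order
    remaining = dict(docs)       # shallow copy so the input is not mutated

    while remaining and len(order) < k:
        # Iterate titles in sorted order so ties resolve deterministically
        # (max returns the first maximal element it encounters).
        best = max(sorted(remaining), key=lambda t: len(remaining[t] - seen))
        if not remaining[best] - seen:
            break                # no remaining document adds anything new
        order.append(best)
        seen |= remaining.pop(best)

    return order


if __name__ == "__main__":
    toy = {
        "a": {"x", "y", "z"},
        "b": {"x", "y"},
        "c": {"w"},
    }
    print(greedy_sequence(toy, 2))
```

Because the loop stops as soon as no document contributes an unseen term, the returned sequence may be shorter than k, which matches the "at most k" wording of the task.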
