In this paper we describe computational approaches to summarizing dynamically introduced information, namely online discussions and blogs, along with their evaluation. Past research has focused mainly on text-based summarization where the input is predominantly newswire data. When branching into these newly emerged data types, we face a number of difficulties, which are discussed here.
For example, a generational frame for retaliation is: "When Y caused a negative event for X, X caused a negative event for Y." This is a conceptually abstract description of retaliation. To produce a reasonable summary, we must (1) instantiate the generational frame, and (2) augment it with information from units adjacent to the pivotal unit. We will try to convey what is involved by showing how a baseline summary evolves into a reasonable summary as information from adjacent units is added.
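The two steps above can be sketched in code. The following is a minimal illustration, not the paper's implementation: the event representation, the `actor`/`target`/`polarity` fields, and the toy event list are all hypothetical assumptions introduced here to show what instantiating the retaliation frame might look like.

```python
# Hypothetical sketch: instantiating the abstract retaliation frame
# "When Y caused a negative event for X, X caused a negative event for Y."
# The event schema (actor/target/polarity) is an illustrative assumption.

def instantiate_retaliation(events):
    """Scan an ordered list of event records and return a concrete
    instantiation of the retaliation frame for the first matching pair."""
    for i, first in enumerate(events):
        for second in events[i + 1:]:
            # Pattern: the later event reverses actor and target of the
            # earlier one, and both events are negative for their targets.
            if (first["polarity"] == "negative"
                    and second["polarity"] == "negative"
                    and first["actor"] == second["target"]
                    and first["target"] == second["actor"]):
                return (f'{first["actor"]} caused a negative event for '
                        f'{first["target"]}, so {second["actor"]} caused '
                        f'a negative event for {second["target"]}.')
    return None  # no retaliation pattern found

events = [
    {"actor": "Y", "target": "X", "polarity": "negative"},
    {"actor": "X", "target": "Z", "polarity": "positive"},
    {"actor": "X", "target": "Y", "polarity": "negative"},
]
print(instantiate_retaliation(events))
# → Y caused a negative event for X, so X caused a negative event for Y.
```

Augmenting the instantiated frame with material from adjacent units (step 2) would then attach context from the discussion turns surrounding the matched pair; that step is omitted here.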
Human-quality text summarization systems are difficult to design, and even more difficult to evaluate, in part because documents can differ along several dimensions, such as length, writing style, and lexical usage. Nevertheless, certain cues can often help suggest which sentences to select for inclusion in a summary. This paper presents an analysis of news-article summaries generated by sentence extraction. Sentences are ranked for potential inclusion in the summary using a weighted combination of linguistic features derived from an analysis of newswire summaries. This paper evaluates the relative effectiveness of these features. To do so, we discuss the construction of a large corpus of extraction-based summaries, and characterize the underlying degree of difficulty of summarization at different compression levels on articles in this corpus. Results on our feature set are presented after normalization by this degree of difficulty.
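Ranking sentences by a weighted combination of features can be sketched as follows. This is a minimal illustration under assumed features (sentence position, length, and cue-word density) and arbitrary weights; the paper's actual feature set and learned weights are not reproduced here.

```python
# Illustrative sketch of extractive ranking with weighted linguistic
# features. The three features and the weights below are assumptions
# chosen for the example, not the paper's trained values.

def features(sentence, position, n_sentences, cue_words):
    words = sentence.lower().split()
    return {
        # Earlier sentences tend to score higher in news text.
        "position": 1.0 - position / max(n_sentences - 1, 1),
        # Favor fuller sentences, capped at 20 words.
        "length": min(len(words) / 20.0, 1.0),
        # Density of cue words signalling summary-worthy content.
        "cue": sum(w.strip(".,") in cue_words for w in words) / max(len(words), 1),
    }

def rank(sentences, weights, cue_words):
    """Return sentences sorted by their weighted feature score, best first."""
    scored = []
    for i, s in enumerate(sentences):
        f = features(s, i, len(sentences), cue_words)
        scored.append((sum(weights[k] * f[k] for k in weights), s))
    return [s for _, s in sorted(scored, reverse=True)]

doc = [
    "The government announced significant new funding for schools.",
    "It rained.",
    "Officials said the announcement was significant for rural districts.",
]
top = rank(doc, weights={"position": 0.5, "length": 0.3, "cue": 0.2},
           cue_words={"significant", "announced", "announcement"})
print(top[0])
# → The government announced significant new funding for schools.
```

A compression level then corresponds to keeping only the top fraction of the ranked list, which is where the degree-of-difficulty normalization discussed above applies.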
Due to the recent boom in artificial intelligence (AI) research, including computer vision (CV), it has become impossible for researchers in these fields to keep up with the exponentially increasing number of manuscripts. In response, this paper proposes the paper summary generation (PSG) task, together with a simple but effective method to automatically generate an academic paper summary from raw PDF data. We realize PSG by combining a supervised, vision-based component detector with an unsupervised, language-based important-sentence extractor, and the approach is applicable to any manuscript format on which it has been trained. We present a quantitative evaluation of the vision-based component extraction, and a qualitative evaluation showing that our system can extract both visual items and sentences that are helpful for understanding. Summaries produced by PSG for the 979 manuscripts accepted to the Conference on Computer Vision and Pattern Recognition (CVPR) 2018 are available. We believe the proposed method will provide a better way for researchers to keep up with important academic papers.
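The two-stage pipeline described above can be sketched schematically. Both stages below are hypothetical stand-ins: the real detector is a supervised vision model over page images, and the real extractor is the paper's unsupervised method; here the detector is a pass-through over pre-labelled regions and the extractor uses a crude word-overlap centrality score, purely to show how the stages compose.

```python
# Schematic sketch of a PSG-style pipeline: component detection on the
# PDF layout, then unsupervised key-sentence extraction from body text.
# Region labels and the toy page data are illustrative assumptions.

def detect_components(pdf_pages):
    """Stub for the supervised vision-based detector: the real system
    localizes figures, tables, and body-text blocks in page images;
    here we simply filter pre-labelled regions."""
    return [r for page in pdf_pages for r in page
            if r["label"] in ("body", "figure")]

def extract_key_sentence(body_text):
    """Stub for the unsupervised extractor: score each sentence by its
    word overlap with the rest of the text (a crude centrality measure)."""
    sentences = [s.strip() for s in body_text.split(".") if s.strip()]

    def overlap(s):
        words = set(s.lower().split())
        others = set(" ".join(t for t in sentences if t != s).lower().split())
        return len(words & others)

    return max(sentences, key=overlap)

pages = [[
    {"label": "body", "text": ("We propose a detector. The detector finds "
                               "figures. Figures help readers.")},
    {"label": "header", "text": "CVPR 2018"},
]]
body = " ".join(r["text"] for r in detect_components(pages)
                if r["label"] == "body")
print(extract_key_sentence(body))
# → The detector finds figures
```

The selected sentence and the detected figure regions would then be assembled into the paper summary.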