Towards Personalized and Human-in-the-Loop Document Summarization
arXiv.org Artificial Intelligence
The ubiquitous availability of computing devices and the widespread use of the internet continuously generate large amounts of data. As a result, the information available on any given topic far exceeds what humans can properly process, causing what is known as information overload. To cope efficiently with large amounts of information and generate content of significant value to users, we need to identify, merge and summarise information. Data summaries can help gather related information and condense it into a shorter format that enables answering complicated questions, gaining new insight and discovering conceptual boundaries. This thesis focuses on three main challenges to alleviate information overload using novel summarisation techniques. It further intends to facilitate the analysis of documents to support personalised information extraction. This thesis separates the research issues into four areas, covering (i) feature engineering in document summarisation, (ii) traditional static and inflexible summaries, (iii) traditional generic summarisation approaches, and (iv) the need for reference summaries. We propose novel approaches to tackle these challenges by (i) enabling automatic intelligent feature engineering, (ii) enabling flexible and interactive summarisation, and (iii) utilising intelligent and personalised summarisation approaches. The experimental results demonstrate the efficiency of the proposed approaches compared to other state-of-the-art models. We further propose solutions to the information overload problem in different domains through summarisation, covering network traffic data, health data and business process data.
Aug-21-2021
- Country:
- Asia
- China > Hong Kong (0.04)
- Macao (0.04)
- Middle East > Iran
- Tehran Province > Tehran (0.04)
- Singapore (0.04)
- Taiwan > Taiwan Province
- Taipei (0.04)
- Turkmenistan > Aspheron Ridge (0.04)
- Europe
- Austria > Upper Austria
- Linz (0.04)
- Czechia > Prague (0.04)
- Estonia > Harju County
- Tallinn (0.04)
- France
- Auvergne-Rhône-Alpes > Puy-de-Dôme
- Clermont-Ferrand (0.04)
- Nouvelle-Aquitaine > Gironde
- Bordeaux (0.04)
- Germany > Bavaria
- Upper Bavaria > Munich (0.04)
- Middle East > Cyprus
- Portugal > Braga
- Braga (0.04)
- United Kingdom > England
- Cambridgeshire > Cambridge (0.04)
- Greater London > London (0.04)
- North America > United States
- Colorado (0.04)
- Florida > Miami-Dade County
- Coral Gables (0.04)
- Wisconsin > Dane County
- Madison (0.04)
- Virginia > Fairfax County
- McLean (0.04)
- California > Santa Clara County
- Palo Alto (0.04)
- Hawaii (0.04)
- Arizona (0.04)
- New York (0.04)
- Wyoming > Campbell County (0.04)
- Oceania > Australia
- New South Wales
- Sydney (0.14)
- Wollongong (0.04)
- Genre:
- Overview (1.00)
- Research Report
- New Finding (1.00)
- Promising Solution (1.00)
- Industry:
- Education (0.92)
- Government (0.92)
- Health & Medicine
- Health Care Technology > Medical Record (0.67)
- Pharmaceuticals & Biotechnology (1.00)
- Therapeutic Area > Oncology (0.67)
- Information Technology
- Security & Privacy (0.67)
- Services (0.92)
- Law (0.67)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Media > News (0.92)
- Technology:
- Information Technology
- Artificial Intelligence
- Cognitive Science > Problem Solving (0.93)
- Machine Learning
- Evolutionary Systems (1.00)
- Learning Graphical Models
- Directed Networks > Bayesian Learning (1.00)
- Undirected Networks > Markov Models (0.67)
- Neural Networks > Deep Learning (1.00)
- Performance Analysis > Accuracy (0.67)
- Reinforcement Learning (1.00)
- Statistical Learning > Clustering (0.93)
- Natural Language
- Grammars & Parsing (1.00)
- Information Retrieval (1.00)
- Text Processing (1.00)
- Representation & Reasoning
- Expert Systems (1.00)
- Search (1.00)
- Uncertainty
- Bayesian Inference (0.67)
- Fuzzy Logic (0.67)
- Communications
- Networks (1.00)
- Social Media (1.00)
- Web (1.00)
- Data Science > Data Mining (1.00)