Towards Personalized and Human-in-the-Loop Document Summarization

Aug-21-2021–arXiv.org Artificial Intelligence

The ubiquitous availability of computing devices and the widespread use of the internet continuously generate large amounts of data. As a result, the amount of information available on any given topic is far beyond humans' capacity to process it properly, causing what is known as information overload. To cope efficiently with large amounts of information and generate content of significant value to users, we need to identify, merge and summarise information. Data summaries can help gather related information and collect it into a shorter format that enables answering complicated questions, gaining new insight and discovering conceptual boundaries. This thesis focuses on three main challenges to alleviate information overload using novel summarisation techniques. It further intends to facilitate the analysis of documents to support personalised information extraction. This thesis separates the research issues into four areas, covering (i) feature engineering in document summarisation, (ii) traditional static and inflexible summaries, (iii) traditional generic summarisation approaches, and (iv) the need for reference summaries. We propose novel approaches to tackle these challenges by (i) enabling automatic intelligent feature engineering, (ii) enabling flexible and interactive summarisation, and (iii) utilising intelligent and personalised summarisation approaches. The experimental results demonstrate the efficiency of the proposed approaches compared to other state-of-the-art models. We further propose solutions to the information overload problem in different domains through summarisation, covering network traffic data, health data and business process data.

automatic intelligent feature engineering, computational natural language learning, iot-enabled process data analytic pipeline

Country:
- Oceania > Australia
  - New South Wales
    - Sydney (0.14)
    - Wollongong (0.04)
- North America > United States
  - New York (0.04)
  - Hawaii (0.04)
  - Colorado (0.04)
  - Arizona (0.04)
  - Wisconsin > Dane County
    - Madison (0.04)
  - Virginia > Fairfax County
    - McLean (0.04)
  - Florida > Miami-Dade County
    - Coral Gables (0.04)
  - California > Santa Clara County
    - Palo Alto (0.04)
- Europe
  - Czechia > Prague (0.04)
  - United Kingdom > England
    - Greater London > London (0.04)
    - Cambridgeshire > Cambridge (0.04)
  - Portugal > Braga
    - Braga (0.04)
  - Middle East > Cyprus
    - Pafos > Paphos (0.04)
  - Germany > Bavaria
    - Upper Bavaria > Munich (0.04)
  - France
    - Nouvelle-Aquitaine > Gironde
      - Bordeaux (0.04)
    - Auvergne-Rhône-Alpes > Puy-de-Dôme
      - Clermont-Ferrand (0.04)
  - Estonia > Harju County
    - Tallinn (0.04)
  - Austria > Upper Austria
    - Linz (0.04)
- Asia
  - China > Hong Kong (0.04)
  - Singapore (0.04)
  - Macao (0.04)
  - Taiwan > Taiwan Province
    - Taipei (0.04)
  - Middle East > Iran
    - Tehran Province > Tehran (0.04)

Genre:
- Overview (1.00)
- Research Report
  - Promising Solution (1.00)
  - New Finding (1.00)

Industry:
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Education (0.92)
- Media > News (0.92)
- Government (0.92)
- Law (0.67)
- Information Technology
  - Services (0.92)
  - Security & Privacy (0.67)
- Health & Medicine
  - Pharmaceuticals & Biotechnology (1.00)
  - Therapeutic Area > Oncology (0.67)
  - Health Care Technology > Medical Record (0.67)

Technology:
- Information Technology
  - Data Science > Data Mining (1.00)
  - Communications
    - Web (1.00)
    - Social Media (1.00)
    - Networks (1.00)
  - Artificial Intelligence
    - Cognitive Science > Problem Solving (0.93)
    - Representation & Reasoning
      - Search (1.00)
      - Expert Systems (1.00)
      - Uncertainty
        - Fuzzy Logic (0.67)
        - Bayesian Inference (0.67)
    - Natural Language
      - Text Processing (1.00)
      - Information Retrieval (1.00)
      - Grammars & Parsing (1.00)
    - Machine Learning
      - Reinforcement Learning (1.00)
      - Neural Networks > Deep Learning (1.00)
      - Evolutionary Systems (1.00)
      - Statistical Learning > Clustering (0.93)
      - Performance Analysis > Accuracy (0.67)
      - Learning Graphical Models
        - Directed Networks > Bayesian Learning (1.00)
        - Undirected Networks > Markov Models (0.67)