Coping with low data availability for social media crisis message categorisation

May-26-2023–arXiv.org Artificial Intelligence

During crisis situations, social media allows people to quickly share information, including messages requesting help. This can be valuable to emergency responders, who need to categorise and prioritise these messages based on the type of assistance being requested. However, the high volume of messages makes it difficult to filter and prioritise them without the use of computational techniques. Fully supervised filtering techniques for crisis message categorisation typically require a large amount of annotated training data, which is difficult to obtain during an ongoing crisis and expensive to create in terms of time and labour. This thesis focuses on addressing the challenge of low data availability when categorising crisis messages for emergency response. It first presents domain adaptation as a solution to this problem, which involves learning a categorisation model from annotated data from past crisis events (source domain) and adapting it to categorise messages from an ongoing crisis event (target domain). In many-to-many adaptation, where the model is trained on multiple past events and adapted to multiple ongoing events, a multi-task learning approach is proposed using pre-trained language models. This approach outperforms baselines, and an ensemble approach further improves performance...

large language model, machine learning, natural language, (26 more...)

Country:
- South America
  - Paraguay > Asunción
    - Asunción (0.04)
  - Colombia > Meta Department
    - Villavicencio (0.04)
  - Chile > Santiago Metropolitan Region
    - Santiago Province > Santiago (0.04)
- Oceania > Australia
  - Queensland (0.04)
  - Victoria > Melbourne (0.04)
- North America
  - Costa Rica (0.04)
  - United States
    - Texas (0.14)
    - Oklahoma (0.04)
    - Pennsylvania (0.04)
    - New Mexico (0.04)
    - Mississippi (0.04)
    - Minnesota > Hennepin County
      - Minneapolis (0.14)
    - Colorado > Denver County
      - Denver (0.04)
    - Arizona > Maricopa County
      - Scottsdale (0.04)
    - Louisiana > Orleans Parish
      - New Orleans (0.04)
    - Maryland
      - Montgomery County > Gaithersburg (0.04)
      - Howard County > Ellicott City (0.04)
      - Baltimore (0.04)
    - New York > Monroe County
      - Rochester (0.04)
    - Wisconsin > Dane County
      - Madison (0.04)
    - Virginia > Montgomery County
      - Blacksburg (0.04)
    - Washington > King County
      - Seattle (0.13)
      - Bellevue (0.04)
    - California
      - San Diego County > San Diego (0.04)
      - Santa Clara County > Palo Alto (0.04)
      - Fresno County (0.04)
  - Mexico > Mexico City
    - Mexico City (0.04)
  - Canada
    - Alberta (0.04)
    - British Columbia > Metro Vancouver Regional District
      - Vancouver (0.13)
- Europe
  - Germany (0.04)
  - Norway (0.04)
  - Slovenia (0.04)
  - Czechia > Prague (0.04)
  - Spain > Valencian Community
    - Valencia Province > Valencia (0.04)
  - Italy > Calabria
    - Catanzaro Province > Catanzaro (0.04)
  - Belgium
    - Brussels-Capital Region > Brussels (0.04)
    - Flanders > West Flanders
      - Bruges (0.04)
  - Finland > Uusimaa
    - Helsinki (0.04)
  - Ireland > Leinster
    - County Dublin > Dublin (0.04)
- Asia
  - Nepal (0.04)
  - Malaysia (0.04)
  - China (0.04)
  - Cambodia (0.04)
  - Bangladesh (0.04)
  - South Korea > Seoul
    - Seoul (0.04)
  - Middle East > Qatar
    - Ad-Dawhah > Doha (0.04)
- Africa
  - Mozambique (0.04)
  - Middle East > Morocco (0.04)
  - Ethiopia > Addis Ababa
    - Addis Ababa (0.04)

Genre:
- Workflow (1.00)
- Overview (1.00)
- Research Report
  - Promising Solution (1.00)
  - New Finding (1.00)
  - Experimental Study (0.92)

Industry:
- Media (1.00)
- Health & Medicine (1.00)
- Education (1.00)
- Government (0.67)
- Information Technology > Services (0.67)
- Energy > Power Industry (0.45)
- Leisure & Entertainment > Sports (0.45)
- Law Enforcement & Public Safety
  - Crime Prevention & Enforcement (0.67)
  - Terrorism (0.46)

Technology:
- Information Technology
  - Communications > Social Media (1.00)
  - Artificial Intelligence
    - Representation & Reasoning > Uncertainty
      - Bayesian Inference (0.67)
    - Natural Language
      - Text Processing (1.00)
      - Text Classification (1.00)
      - Large Language Model (1.00)
      - Machine Translation (0.92)
      - Chatbot (0.68)
    - Machine Learning
      - Statistical Learning (1.00)
      - Neural Networks > Deep Learning (1.00)
      - Inductive Learning (1.00)
      - Performance Analysis > Accuracy (0.93)
      - Learning Graphical Models > Directed Networks
        - Bayesian Learning (1.00)
