A Neighbourhood Framework for Resource-Lean Content Flagging

Sarwar, Sheikh Muhammad, Zlatkova, Dimitrina, Hardalov, Momchil, Dinkov, Yoan, Augenstein, Isabelle, Nakov, Preslav

Mar-31-2021–arXiv.org Machine Learning

We propose a novel interpretable framework for cross-lingual content flagging, which significantly outperforms prior work both in terms of predictive performance and average inference time. The framework is based on a nearest-neighbour architecture and is interpretable by design. Moreover, it can easily adapt to new instances without the need to retrain it from scratch. Unlike prior work, (i) we encode not only the texts, but also the labels in the neighbourhood space (which yields better accuracy), and (ii) we use a bi-encoder instead of a cross-encoder (which saves computation time). Our evaluation results on ten different datasets for abusive language detection in eight languages show sizable improvements over the state of the art, as well as a speed-up at inference time.
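The nearest-neighbour, bi-encoder idea in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and not the authors' implementation: `encode` is a toy character-trigram encoder standing in for a learned multilingual encoder, `NeighbourFlagger` is a hypothetical class, and the labels enter only through a similarity-weighted vote, which is far simpler than encoding labels in the neighbourhood space as the paper describes. The sketch does show the two properties the abstract highlights: each text is encoded independently (so neighbour vectors are computed once and reused), and new labelled instances can be added without retraining.

```python
import math
from collections import Counter


def encode(text):
    """Toy stand-in encoder (assumption): L2-normalised character-trigram counts."""
    padded = f"  {text.lower()}  "
    counts = Counter(padded[i:i + 3] for i in range(len(padded) - 2))
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {gram: c / norm for gram, c in counts.items()}


def cosine(u, v):
    """Cosine similarity of two sparse unit vectors stored as dicts."""
    if len(u) > len(v):
        u, v = v, u
    return sum(weight * v.get(gram, 0.0) for gram, weight in u.items())


class NeighbourFlagger:
    """Hypothetical k-nearest-neighbour flagger over independently encoded texts."""

    def __init__(self):
        self.index = []  # list of (text, label, vector); label 1 = flagged, 0 = neutral

    def add(self, text, label):
        # Bi-encoder style: the neighbour is encoded once, independently of any query.
        self.index.append((text, label, encode(text)))

    def predict(self, query, k=3):
        q = encode(query)
        scored = sorted(
            ((cosine(q, vec), text, label) for text, label, vec in self.index),
            reverse=True,
        )[:k]
        # Similarity-weighted vote: flagged neighbours add, neutral ones subtract.
        score = sum(sim if label == 1 else -sim for sim, _, label in scored)
        return (1 if score > 0 else 0), scored


flagger = NeighbourFlagger()
flagger.add("you are a stupid idiot", 1)
flagger.add("shut up you idiot", 1)
flagger.add("have a lovely day", 0)
flagger.add("thank you for your help", 0)

label, neighbours = flagger.predict("what an idiot")
print(label, [(text, round(sim, 2)) for sim, text, _ in neighbours])
```

The returned neighbours double as the explanation for a prediction, which is the sense in which a neighbourhood-based classifier is interpretable by design.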

dataset, neighbour, representation

Country:
- North America > United States
  - Texas > Travis County
    - Austin (0.04)
  - New York > New York County
    - New York City (0.04)
  - New Mexico > Santa Fe County
    - Santa Fe (0.04)
  - Minnesota > Hennepin County
    - Minneapolis (0.14)
  - Massachusetts > Hampshire County
    - Amherst (0.04)
- Europe
  - United Kingdom (0.14)
  - France (0.04)
  - Spain > Catalonia
    - Barcelona Province > Barcelona (0.04)
  - Italy > Tuscany
    - Florence (0.04)
  - Denmark > Capital Region
    - Copenhagen (0.04)
  - Belgium > Brussels-Capital Region
    - Brussels (0.04)
- Asia
  - China > Hong Kong (0.04)
  - Middle East
    - Qatar (0.04)
    - Jordan (0.04)
- Africa > Ethiopia
  - Addis Ababa > Addis Ababa (0.04)

Genre:
- Research Report (0.40)

Industry:
- Government (1.00)
- Law (0.68)

Technology:
- Information Technology
  - Communications > Social Media (1.00)
  - Artificial Intelligence
    - Representation & Reasoning (0.99)
    - Natural Language > Text Processing (0.68)
    - Machine Learning > Neural Networks
      - Deep Learning (0.93)
