Debiased and Denoised Entity Recognition from Distant Supervision

May-28-2025, 21:07:15 GMT–Neural Information Processing Systems

While distant supervision has been extensively explored in NLP tasks such as named entity recognition (NER), a major obstacle is the noise that inevitably arises when distant labels are assigned without human supervision. A few past works address this problem by adopting a self-training framework with a sample-selection mechanism. In this work, we identify two types of bias that prior work overlooked and that lead to inferior performance in the distantly supervised NER setup. First, we characterize the noise concealed in the distant labels as highly structural rather than fully randomized. Second, the self-training framework itself introduces an inherent bias that causes erroneous behavior in both sample selection and, ultimately, prediction.

computational linguistic, large language model, machine learning, (19 more...)

Country:
- Asia (0.68)
- Europe (1.00)
- North America > United States
  - Minnesota > Hennepin County > Minneapolis (0.14)

Genre:
- Research Report (0.67)

Technology:
- Information Technology > Artificial Intelligence
  - Machine Learning > Neural Networks
    - Deep Learning (1.00)
  - Natural Language
    - Large Language Model (0.99)
    - Text Processing (1.00)
