AITopics | Sarwar, Sheikh Muhammad

Collaborating Authors

Sarwar, Sheikh Muhammad

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Target Span Detection for Implicit Harmful Content

Jafari, Nazanin, Allan, James, Sarwar, Sheikh Muhammad

arXiv.org Artificial IntelligenceJun-27-2024

Identifying the targets of hate speech is a crucial step in grasping the nature of such speech and, ultimately, in improving the detection of offensive posts on online forums. Much harmful content on online platforms uses implicit language especially when targeting vulnerable and protected groups such as using stereotypical characteristics instead of explicit target names, making it harder to detect and mitigate the language. In this study, we focus on identifying implied targets of hate speech, essential for recognizing subtler hate speech and enhancing the detection of harmful content on digital platforms. We define a new task aimed at identifying the targets even when they are not explicitly stated. To address that task, we collect and annotate target spans in three prominent implicit hate speech datasets: SBIC, DynaHate, and IHC. We call the resulting merged collection Implicit-Target-Span. The collection is achieved using an innovative pooling method with matching scores based on human annotations and Large Language Models (LLMs). Our experiments indicate that Implicit-Target-Span provides a challenging test bed for target span detection methods.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2403.19836

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Detecting Harmful Content On Online Platforms: What Platforms Need Vs. Where Research Efforts Go

Arora, Arnav, Nakov, Preslav, Hardalov, Momchil, Sarwar, Sheikh Muhammad, Nayak, Vibha, Dinkov, Yoan, Zlatkova, Dimitrina, Dent, Kyle, Bhatawdekar, Ameya, Bouchard, Guillaume, Augenstein, Isabelle

arXiv.org Artificial IntelligenceJun-6-2023

The proliferation of harmful content on online platforms is a major societal problem, which comes in many different forms including hate speech, offensive language, bullying and harassment, misinformation, spam, violence, graphic content, sexual abuse, self harm, and many other. Online platforms seek to moderate such content to limit societal harm, to comply with legislation, and to create a more inclusive environment for their users. Researchers have developed different methods for automatically detecting harmful content, often focusing on specific sub-problems or on narrow communities, as what is considered harmful often depends on the platform and on the context. We argue that there is currently a dichotomy between what types of harmful content online platforms seek to curb, and what research efforts there are to automatically detect such content. We thus survey existing methods as well as content moderation policies by online platforms in this light and we suggest directions for future work.

computational linguistic, data mining, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2103.00153

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)
North America > United States > Massachusetts (0.28)
(3 more...)

Genre: Research Report (1.00)

Industry:

Media > News (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Services (1.00)
(4 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science (1.00)
Information Technology > Communications > Social Media (1.00)
(2 more...)

Add feedback

AutoTriggER: Label-Efficient and Robust Named Entity Recognition with Auxiliary Trigger Extraction

Lee, Dong-Ho, Selvam, Ravi Kiran, Sarwar, Sheikh Muhammad, Lin, Bill Yuchen, Morstatter, Fred, Pujara, Jay, Boschee, Elizabeth, Allan, James, Ren, Xiang

arXiv.org Artificial IntelligenceMay-18-2023

Deep neural models for named entity recognition (NER) have shown impressive results in overcoming label scarcity and generalizing to unseen entities by leveraging distant supervision and auxiliary information such as explanations. However, the costs of acquiring such additional information are generally prohibitive. In this paper, we present a novel two-stage framework (AutoTriggER) to improve NER performance by automatically generating and leveraging ``entity triggers'' which are human-readable cues in the text that help guide the model to make better decisions. Our framework leverages post-hoc explanation to generate rationales and strengthens a model's prior knowledge using an embedding interpolation technique. This approach allows models to exploit triggers to infer entity boundaries and types instead of solely memorizing the entity words themselves. Through experiments on three well-studied NER datasets, AutoTriggER shows strong label-efficiency, is capable of generalizing to unseen entities, and outperforms the RoBERTa-CRF baseline by nearly 0.5 F1 points on average.

artificial intelligence, natural language, text processing, (4 more...)

arXiv.org Artificial Intelligence

2109.04726

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Add feedback

A Neighbourhood Framework for Resource-Lean Content Flagging

Sarwar, Sheikh Muhammad, Zlatkova, Dimitrina, Hardalov, Momchil, Dinkov, Yoan, Augenstein, Isabelle, Nakov, Preslav

arXiv.org Machine LearningMar-31-2021

We propose a novel interpretable framework for cross-lingual content flagging, which significantly outperforms prior work both in terms of predictive performance and average inference time. The framework is based on a nearest-neighbour architecture and is interpretable by design. Moreover, it can easily adapt to new instances without the need to retrain it from scratch. Unlike prior work, (i) we encode not only the texts, but also the labels in the neighbourhood space (which yields better accuracy), and (ii) we use a bi-encoder instead of a cross-encoder (which saves computation time). Our evaluation results on ten different datasets for abusive language detection in eight languages shows sizable improvements over the state of the art, as well as a speed-up at inference time.

dataset, deep learning, neural network, (20 more...)

arXiv.org Machine Learning

2103.17055

Country:

Europe (1.00)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre: Research Report (0.40)

Industry: Government (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.99)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)

Add feedback