AITopics | turker

Collaborating Authors

turker

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

68053af2923e00204c3ca7c6a3150cf7-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-12-2026, 10:57:17 GMT

annotation, evaluation, insecurity, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.32)

Add feedback

Deep Active Learning with Crowdsourcing Data for Privacy Policy Classification

Qiu, Wenjun, Lie, David

arXiv.org Artificial IntelligenceNov-4-2025

Privacy policies are statements that notify users of the services' data practices. However, few users are willing to read through policy texts due to the length and complexity. While automated tools based on machine learning exist for privacy policy analysis, to achieve high classification accuracy, classifiers need to be trained on a large labeled dataset. Most existing policy corpora are labeled by skilled human annotators, requiring significant amount of labor hours and effort. In this paper, we leverage active learning and crowdsourcing techniques to develop an automated classification tool named Calpric (Crowdsourcing Active Learning PRIvacy Policy Classifier), which is able to perform annotation equivalent to those done by skilled human annotators with high accuracy while minimizing the labeling cost. Specifically, active learning allows classifiers to proactively select the most informative segments to be labeled. On average, our model is able to achieve the same F1 score using only 62% of the original labeling effort. Calpric's use of active learning also addresses naturally occurring class imbalance in unlabeled privacy policy datasets as there are many more statements stating the collection of private information than stating the absence of collection. By selecting samples from the minority class for labeling, Calpric automatically creates a more balanced training set.

information retrieval, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

doi: 10.48550/arXiv.2401.08038

2008.02954

Country:

North America > Canada (0.68)
North America > United States > California (0.46)

Genre: Research Report > New Finding (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Communications > Social Media > Crowdsourcing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(5 more...)

Add feedback

68053af2923e00204c3ca7c6a3150cf7-AuthorFeedback.pdf

Neural Information Processing SystemsOct-2-2025, 21:56:26 GMT

annotation, artificial intelligence, insecurity, (17 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.32)

Add feedback

Outside Knowledge Conversational Video (OKCV) Dataset -- Dialoguing over Videos

Reichman, Benjamin, Patsch, Constantin, Truxal, Jack, Jain, Atishay, Heck, Larry

arXiv.org Artificial IntelligenceJun-12-2025

In outside knowledge visual question answering (OK-VQA), the model must identify relevant visual information within an image and incorporate external knowledge to accurately respond to a question. Extending this task to a visually grounded dialogue setting based on videos, a conversational model must both recognize pertinent visual details over time and answer questions where the required information is not necessarily present in the visual information. Moreover, the context of the overall conversation must be considered for the subsequent dialogue. To explore this task, we introduce a dataset comprised of $2,017$ videos with $5,986$ human-annotated dialogues consisting of $40,954$ interleaved dialogue turns. While the dialogue context is visually grounded in specific video segments, the questions further require external knowledge that is not visually present. Thus, the model not only has to identify relevant video parts but also leverage external knowledge to converse within the dialogue. We further provide several baselines evaluated on our dataset and show future challenges associated with this task. The dataset is made publicly available here: https://github.com/c-patsch/OKCV.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2506.09953

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Vision (0.94)
(3 more...)

Add feedback

Understanding ME? Multimodal Evaluation for Fine-grained Visual Commonsense

Wang, Zhecan, You, Haoxuan, He, Yicheng, Li, Wenhao, Chang, Kai-Wei, Chang, Shih-Fu

arXiv.org Artificial IntelligenceOct-23-2023

Visual commonsense understanding requires Vision Language (VL) models to not only understand image and text but also cross-reference in-between to fully integrate and achieve comprehension of the visual scene described. Recently, various approaches have been developed and have achieved high performance on visual commonsense benchmarks. However, it is unclear whether the models really understand the visual scene and underlying commonsense knowledge due to limited evaluation data resources. To provide an in-depth analysis, we present a Multimodal Evaluation (ME) pipeline to automatically generate question-answer pairs to test models' understanding of the visual scene, text, and related knowledge. We then take a step further to show that training with the ME data boosts the model's performance in standard VCR evaluation. Lastly, our in-depth analysis and comparison reveal interesting findings: (1) semantically low-level information can assist the learning of high-level information but not the opposite; (2) visual information is generally under utilization compared with text.

information, modality, vl model, (15 more...)

arXiv.org Artificial Intelligence

2211.05895

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > New York (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.68)

Add feedback

ChatGPT Is Reshaping Crowd Work

WIREDJul-7-2023, 12:00:00 GMT

We owe our understanding of human behavior thanks, in part, to Bob. He spends hours some days as a subject in academic psychology studies, filling out surveys on crowd-work platforms like Amazon's Mechanical Turk, where users perform simple digital tasks for small sums of money. The questionnaires often prompt him to recall a time he felt sad, or isolated, or something likewise morose. Sometimes typing his sob stories over and over gets "really really monotonous," he says. So Bob asks ChatGPT to pour its simulacrum of a heart out instead.

chatgpt, language model, reshaping crowd work, (6 more...)

WIRED

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.86)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)

Add feedback

A New Task and Dataset on Detecting Attacks on Human Rights Defenders

Ran, Shihao, Lu, Di, Tetreault, Joel, Cahill, Aoife, Jaimes, Alejandro

arXiv.org Artificial IntelligenceJun-30-2023

The ability to conduct retrospective analyses of attacks on human rights defenders over time and by location is important for humanitarian organizations to better understand historical or ongoing human rights violations and thus better manage the global impact of such events. We hypothesize that NLP can support such efforts by quickly processing large collections of news articles to detect and summarize the characteristics of attacks on human rights defenders. To that end, we propose a new dataset for detecting Attacks on Human Rights Defenders (HRDsAttack) consisting of crowdsourced annotations on 500 online news articles. The annotations include fine-grained information about the type and location of the attacks, as well as information about the victim(s). We demonstrate the usefulness of the dataset by using it to train and evaluate baseline models on several sub-tasks to predict the annotated characteristics.

hrdsattack, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2306.17695

Country:

Africa > Middle East > Algeria (0.04)
Asia > Middle East > Jordan (0.04)
Africa > Burundi > Gitega > Gitega (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)

Genre: Research Report (1.00)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Communications > Social Media > Crowdsourcing (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

ClarifyDelphi: Reinforced Clarification Questions with Defeasibility Rewards for Social and Moral Situations

Pyatkin, Valentina, Hwang, Jena D., Srikumar, Vivek, Lu, Ximing, Jiang, Liwei, Choi, Yejin, Bhagavatula, Chandra

arXiv.org Artificial IntelligenceMay-30-2023

Context is everything, even in commonsense moral reasoning. Changing contexts can flip the moral judgment of an action; "Lying to a friend" is wrong in general, but may be morally acceptable if it is intended to protect their life. We present ClarifyDelphi, an interactive system that learns to ask clarification questions (e.g., why did you lie to your friend?) in order to elicit additional salient contexts of a social or moral situation. We posit that questions whose potential answers lead to diverging moral judgments are the most informative. Thus, we propose a reinforcement learning framework with a defeasibility reward that aims to maximize the divergence between moral judgments of hypothetical answers to a question. Human evaluation demonstrates that our system generates more relevant, informative and defeasible questions compared to competitive baselines. Our work is ultimately inspired by studies in cognitive science that have investigated the flexibility in moral cognition (i.e., the diverse contexts in which moral rules can be bent), and we hope that research in this direction can assist both cognitive and computational investigations of moral judgments.

machine learning, natural language, reinforcement learning, (19 more...)

arXiv.org Artificial Intelligence

2212.10409

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Utah (0.04)
North America > Dominican Republic (0.04)
Asia > Middle East > UAE (0.04)

Genre:

Research Report (1.00)
Personal > Interview (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)

Add feedback

The Fewer Splits are Better: Deconstructing Readability in Sentence Splitting

Nomoto, Tadashi

arXiv.org Artificial IntelligenceFeb-2-2023

In this work, we focus on sentence splitting, a subfield of text simplification, motivated largely by an unproven idea that if you divide a sentence in pieces, it should become easier to understand. Our primary goal in this paper is to find out whether this is true. In particular, we ask, does it matter whether we break a sentence into two or three? We report on our findings based on Amazon Mechanical Turk. More specifically, we introduce a Bayesian modeling framework to further investigate to what degree a particular way of splitting the complex sentence affects readability, along with a number of other parameters adopted from diverse perspectives, including clinical linguistics, and cognitive linguistics. The Bayesian modeling experiment provides clear evidence that bisecting the sentence leads to enhanced readability to a degree greater than what we create by trisection.

machine learning, natural language, simplification, (18 more...)

arXiv.org Artificial Intelligence

2302.00937

Country: