AITopics | data collector

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

8797d13e5998acfab387d4bf0a5b9b00-Paper-Conference.pdf

Neural Information Processing Systems, Feb-15-2026, 16:55:02 GMT

data mining, machine learning, natural language, (19 more...)

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Florida > Hillsborough County > University (0.04)
North America > United States > Arizona (0.04)
(3 more...)

Industry:

Information Technology (0.46)
Materials (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

An Algorithmic Framework For Differentially Private Data Analysis on Trusted Processors

Joshua Allen, Bolin Ding, Janardhan Kulkarni, Harsha Nori, Olga Ohrimenko, Sergey Yekhanin

Neural Information Processing Systems, Feb-11-2026, 21:57:20 GMT

The global model of differential privacy, which assumes that users trust the data collector, provides strong privacy guarantees and introduces small errors in the output.

adversary, algorithm, artificial intelligence, (16 more...)

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Security & Privacy (0.68)
Information Technology > Data Science (0.47)
Information Technology > Artificial Intelligence (0.47)

An Algorithmic Framework For Differentially Private Data Analysis on Trusted Processors

Neural Information Processing Systems, Dec-25-2025, 05:51:33 GMT

algorithmic framework, differential privacy, differentially private data analysis, (7 more...)

Industry: Information Technology > Security & Privacy (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.41)

Reinforcement-Enhanced Autoregressive Feature Transformation: Gradient-steered Search in Continuous Space for Postfix Expressions

Neural Information Processing Systems, Oct-9-2025, 00:33:51 GMT

There are two main challenges in solving AFT: 1) efficient feature transformation in a massive discrete search space; 2) robust feature transformation in an open learning environment.

data mining, machine learning, natural language, (18 more...)

Country:

North America > United States > Arizona (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Africa > Cameroon > Far North Region > Maroua (0.04)

Industry:

Education (0.48)
Information Technology (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

An Algorithmic Framework For Differentially Private Data Analysis on Trusted Processors

Joshua Allen, Bolin Ding, Janardhan Kulkarni, Harsha Nori, Olga Ohrimenko, Sergey Yekhanin

Neural Information Processing Systems, Oct-2-2025, 13:03:43 GMT

The global model of differential privacy, which assumes that users trust the data collector, provides strong privacy guarantees and introduces small errors in the output.

artificial intelligence, data mining, machine learning, (17 more...)

Country: North America (0.28)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining (0.68)

Reviewer

Neural Information Processing Systems, Aug-22-2025, 02:11:34 GMT

The reviewer's comments show a misunderstanding concerning what is achieved by our protocol. Differential privacy is not useful in this scenario. DP cannot be used to single out individual "bad" entries. It is misleading to directly compare DP with SMC. In many ways, they complement each other. We have provided a fairly general solution to an important problem: text classification.

accuracy, information, protocol, (14 more...)

Industry: Information Technology > Security & Privacy (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.53)
Information Technology > Artificial Intelligence > Natural Language (0.38)

A Federated Approach to Few-Shot Hate Speech Detection for Marginalized Communities

Ye, Haotian; Wisiorek, Axel; Maronikolakis, Antonis; Alaçam, Özge; Schütze, Hinrich

arXiv.org Artificial Intelligence, Dec-6-2024

Hate speech online remains an understudied issue for marginalized communities, and has seen rising relevance, especially in the Global South, which includes developing societies with increasing internet penetration. In this paper, we aim to provide marginalized communities living in societies where the dominant language is low-resource with a privacy-preserving tool to protect themselves from hate speech on the internet by filtering offensive content in their native languages. Our contribution in this paper is twofold: 1) we release REACT (REsponsive hate speech datasets Across ConTexts), a collection of high-quality, culture-specific hate speech detection datasets comprising seven distinct target groups in eight low-resource languages, curated by experienced data collectors; 2) we propose a solution to few-shot hate speech detection utilizing federated learning (FL), a privacy-preserving and collaborative learning approach, to continuously improve a central model that exhibits robustness when tackling different target groups and languages. By keeping the training local to the users' devices, we ensure the privacy of the users' data while benefitting from the efficiency of federated learning. Furthermore, we personalize client models to target-specific training data and evaluate their performance. Our results indicate the effectiveness of FL across different target groups, whereas the benefits of personalization on few-shot learning are not clear.
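
The federated learning loop described in the abstract above can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a one-feature linear model, plain gradient descent on each client, and size-weighted parameter averaging on the server (commonly called federated averaging), none of which the abstract specifies. The names `local_update`, `federated_average`, and `train` are hypothetical.

```python
def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few gradient steps on one client's private (x, y) pairs."""
    w, b = weights
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y          # prediction error on this example
            grad_w += 2 * err * x
            grad_b += 2 * err
        w -= lr * grad_w / len(data)
        b -= lr * grad_b / len(data)
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Server step: average client parameters, weighted by local dataset size."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def train(clients, rounds=200):
    """Alternate local training and server averaging; raw data stays in `clients`."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        # Each client starts from the current global model and trains locally.
        updates = [local_update(global_weights, data) for data in clients]
        # Only the updated parameters, never the examples, reach the server.
        global_weights = federated_average(updates, [len(d) for d in clients])
    return global_weights


if __name__ == "__main__":
    # Two clients whose data follow y = 2x + 1 over different input ranges.
    client_a = [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0)]
    client_b = [(1.0, 3.0), (1.5, 4.0), (2.0, 5.0)]
    print(train([client_a, client_b]))
```

In this sketch the server only ever sees the `(w, b)` pairs returned by `local_update`, which is the property the abstract refers to when it says training stays local to the users' devices.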

large language model, machine learning, natural language, (15 more...)

arXiv:2412.04942

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(22 more...)

Genre: Research Report > New Finding (0.88)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

An Algorithmic Framework For Differentially Private Data Analysis on Trusted Processors

Neural Information Processing Systems, Oct-9-2024, 20:11:14 GMT

Differential privacy has emerged as the main definition for private data analysis and machine learning. The global model of differential privacy, which assumes that users trust the data collector, provides strong privacy guarantees and introduces small errors in the output. In the local model, by contrast, users do not trust the data collector, and hence randomize their data before sending it to the data collector. Unfortunately, the local model is too strong for several important applications and hence is limited in its applicability. In this work, we propose a framework based on trusted processors and a new definition of differential privacy called Oblivious Differential Privacy, which combines the best of both local and global models.
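
To make the contrast between the global and local models concrete, here is a minimal sketch for a single counting query over private bits. It assumes the Laplace mechanism for the global model and randomized response for the local model, two standard textbook mechanisms; the abstract names neither, and the function names are illustrative. It does not implement Oblivious Differential Privacy or trusted processors.

```python
import math
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def global_count(bits, epsilon, rng):
    """Global model: a trusted collector sees raw bits and noises the total.

    A count changes by at most 1 when one user changes, so the noise
    scale is 1 / epsilon.
    """
    return sum(bits) + laplace_noise(1.0 / epsilon, rng)


def randomize_bit(bit, epsilon, rng):
    """Local model: each user flips their own bit before reporting it."""
    p_keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_keep else 1 - bit


def local_count(bits, epsilon, rng):
    """Untrusted collector: sees only randomized reports, then debiases the sum."""
    p_keep = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    reports = [randomize_bit(b, epsilon, rng) for b in bits]
    n = len(reports)
    # E[sum(reports)] = p_keep * c + (1 - p_keep) * (n - c); solve for c.
    return (sum(reports) - (1.0 - p_keep) * n) / (2.0 * p_keep - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    bits = [1] * 3000 + [0] * 7000
    print("global estimate:", global_count(bits, 1.0, rng))
    print("local estimate: ", local_count(bits, 1.0, rng))
```

Under these assumptions the global estimate carries noise whose scale does not depend on the number of users, while the local estimate's error grows with the square root of the number of users. That gap is the accuracy cost of not trusting the collector.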

algorithmic framework, differential privacy, differentially private data analysis, (4 more...)

Industry: Information Technology > Security & Privacy (0.66)

Technology:

Information Technology > Security & Privacy (0.66)
Information Technology > Data Science (0.66)
Information Technology > Artificial Intelligence > Machine Learning (0.45)

Truthful Dataset Valuation by Pointwise Mutual Information

Zheng, Shuran; Kwon, Yongchan; Qi, Xuan; Zou, James

arXiv.org Artificial Intelligence, May-28-2024

A common way to evaluate a dataset in ML involves training a model on this dataset and assessing the model's performance on a test set. However, this approach has two issues: (1) it may incentivize undesirable data manipulation in data marketplaces, as the self-interested data providers seek to modify the dataset to maximize their evaluation scores; (2) it may select datasets that overfit to potentially small test sets. We propose a new data valuation method that provably guarantees the following: data providers always maximize their expected score by truthfully reporting their observed data. Any manipulation of the data, including but not limited to data duplication, adding random data, data removal, or re-weighting data from different groups, cannot increase their expected score. Our method, following the paradigm of proper scoring rules, measures the pointwise mutual information (PMI) of the test dataset and the evaluated dataset. However, computing the PMI of two datasets is challenging. We introduce a novel PMI measuring method that greatly improves tractability within Bayesian machine learning contexts. This is accomplished through a new characterization of PMI that relies solely on the posterior probabilities of the model parameter at an arbitrarily selected value. Finally, we support our theoretical results with simulations and further test the effectiveness of our data valuation method in identifying the top datasets among multiple data providers. Interestingly, our method outperforms the standard approach of selecting datasets based on the trained model's test performance, suggesting that our truthful valuation score can also be more robust to overfitting.
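
The claim that PMI can be computed from posterior probabilities at an arbitrary parameter value can be checked on a toy model. The sketch below is not the authors' method: it assumes a Beta-Bernoulli model in which the two datasets are conditionally independent given the parameter theta, and under that assumption Bayes' rule gives PMI(D, T) = log p(theta|D) + log p(theta|T) - log p(theta) - log p(theta|D,T). The function names and the uniform default prior are illustrative choices.

```python
import math


def log_beta_fn(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_beta_pdf(theta, a, b):
    """Log density of Beta(a, b) at theta in (0, 1)."""
    return ((a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta)
            - log_beta_fn(a, b))


def counts(data):
    """Return (number of ones, number of zeros) in a list of bits."""
    ones = sum(data)
    return ones, len(data) - ones


def pmi_via_posteriors(d, t, theta, a=1.0, b=1.0):
    """PMI from prior and posterior densities evaluated at one theta."""
    sd, fd = counts(d)
    st, ft = counts(t)
    return (log_beta_pdf(theta, a + sd, b + fd)
            + log_beta_pdf(theta, a + st, b + ft)
            - log_beta_pdf(theta, a, b)
            - log_beta_pdf(theta, a + sd + st, b + fd + ft))


def log_marginal(data, a=1.0, b=1.0):
    """Log marginal likelihood of a bit sequence under a Beta(a, b) prior."""
    s, f = counts(data)
    return log_beta_fn(a + s, b + f) - log_beta_fn(a, b)


def pmi_via_marginals(d, t, a=1.0, b=1.0):
    """Reference value: log p(D, T) - log p(D) - log p(T)."""
    return log_marginal(d + t, a, b) - log_marginal(d, a, b) - log_marginal(t, a, b)


if __name__ == "__main__":
    evaluated = [1, 1, 1, 0, 1]
    test_set = [1, 1, 0, 1]
    print(pmi_via_posteriors(evaluated, test_set, theta=0.3))
    print(pmi_via_posteriors(evaluated, test_set, theta=0.7))
    print(pmi_via_marginals(evaluated, test_set))
```

In this conjugate setting every factor of theta cancels, so any theta strictly between 0 and 1 yields the same score as the marginal-likelihood definition, without computing a marginal likelihood directly.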

data provider, dataset, pmi score, (13 more...)

arXiv:2405.18253

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
Europe > France (0.04)

Genre: Research Report (0.83)

Industry: Leisure & Entertainment > Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Unsupervised Generative Feature Transformation via Graph Contrastive Pre-training and Multi-objective Fine-tuning

Ying, Wangyang; Wang, Dongjie; Hu, Xuanming; Zhou, Yuanchun; Aggarwal, Charu C.; Fu, Yanjie

arXiv.org Artificial Intelligence, May-27-2024

Feature transformation derives a new feature set from the original features to augment the AI power of data. In many science domains such as material performance screening, while feature transformation can model material formula interactions and compositions and discover performance drivers, supervised labels are collected from expensive and lengthy experiments. This issue motivates an Unsupervised Feature Transformation Learning (UFTL) problem. Prior literature, such as manual transformation, supervised feedback guided search, and PCA, either relies on domain knowledge or expensive supervised feedback, or suffers from a large search space, or overlooks non-linear feature-feature interactions. UFTL poses a major challenge for existing methods: how to design a new unsupervised paradigm that captures complex feature interactions and avoids a large search space? To fill this gap, we connect graph, contrastive, and generative learning to develop a measurement-pretrain-finetune paradigm for UFTL. For unsupervised feature set utility measurement, we propose a feature value consistency preservation perspective and develop an unsupervised metric, similar to mean discounted cumulative gain, to evaluate feature set utility. For unsupervised feature set representation pretraining, we regard a feature set as a feature-feature interaction graph, and develop an unsupervised graph contrastive learning encoder to embed feature sets into vectors. For generative transformation finetuning, we regard a feature set as a feature cross sequence and feature transformation as sequential generation. We develop a deep generative feature transformation model that coordinates the pretrained feature set encoder and the gradient information extracted from a feature set utility evaluator to optimize a transformed feature generator.

feature space, graph, transformation, (15 more...)

arXiv:2405.16879

Country:

North America > United States > Kansas > Douglas County > Lawrence (0.14)
North America > United States > Arizona > Maricopa County > Tempe (0.04)
North America > United States > New York (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
