AITopics

artificial intelligence, computer vision and pattern recognition, machine learning, (19 more...)

Generative Adversarial Networks (GAN) have greatly influenced the development of computer vision and artificial intelligence in the past decade and also connected art and machine intelligence together. This book begins with a detailed introduction to the fundamental principles and historical development of GANs, contrasting them with traditional generative models and elucidating the core adversarial mechanisms through illustrative Python examples. The text systematically addresses the mathematical and theoretical underpinnings including probability theory, statistics, and game theory providing a solid framework for understanding the objectives, loss functions, and optimisation challenges inherent to GAN training. Subsequent chapters review classic variants such as Conditional GANs, DCGANs, InfoGAN, and LAPGAN before progressing to advanced training methodologies like Wasserstein GANs, GANs with gradient penalty, least squares GANs, and spectral normalisation techniques. The book further examines architectural enhancements and task-specific adaptations in generators and discriminators, showcasing practical implementations in high resolution image generation, artistic style transfer, video synthesis, text to image generation and other multimedia applications. The concluding sections offer insights into emerging research trends, including self-attention mechanisms, transformer-based generative models, and a comparative analysis with diffusion models, thus charting promising directions for future developments in both academic and applied settings.

2502.04116

Country:

Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Texas (0.04)
(10 more...)

Genre:

Workflow (1.00)
Research Report > Promising Solution (1.00)
Overview (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area (1.00)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Analysis of LLM as a grammatical feature tagger for African American English

Porwal, Rahul, Rozet, Alice, Houck, Pryce, Gowda, Jotsna, Moeller, Sarah, Tang, Kevin

African American English (AAE) presents unique challenges in natural language processing (NLP). This research systematically compares the performance of available NLP models--rule-based, transformer-based, and large language models (LLMs)--capable of identifying key grammatical features of AAE, namely Habitual Be and Multiple Negation. These features were selected for their distinct grammatical complexity and frequency of occurrence. The evaluation involved sentence-level binary classification tasks, using both zero-shot and few-shot strategies. The analysis reveals that while LLMs show promise compared to the baseline, they are influenced by biases such as recency and unrelated features in the text such as formality. This study highlights the necessity for improved model training and architectural adjustments to better accommodate AAE's unique linguistic characteristics. Data and code are available.

computational linguistic, large language model, machine learning, (16 more...)

2502.06004

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Oceania > Australia (0.04)
(11 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.96)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Online Social Support Detection in Spanish Social Media Texts

Tash, Moein Shahiki, Ramos, Luis, Ahani, Zahra, Monroy, Raul, kolesnikova, Olga, Calvo, Hiram, Sidorov, Grigori

The advent of social media has transformed communication, enabling individuals to share their experiences, seek support, and participate in diverse discussions. While extensive research has focused on identifying harmful content like hate speech, the recognition and promotion of positive and supportive interactions remain largely unexplored. This study proposes an innovative approach to detecting online social support in Spanish-language social media texts. We introduce the first annotated dataset specifically created for this task, comprising 3,189 YouTube comments classified as supportive or non-supportive. To address data imbalance, we employed GPT-4o to generate paraphrased comments and create a balanced dataset. We then evaluated social support classification using traditional machine learning models, deep learning architectures, and transformer-based models, including GPT-4o, but only on the unbalanced dataset. Subsequently, we utilized a transformer model to compare the performance between the balanced and unbalanced datasets. Our findings indicate that the balanced dataset yielded improved results for Task 2 (Individual and Group) and Task 3 (Nation, Other, LGBTQ, Black Community, Women, Religion), whereas GPT-4o performed best for Task 1 (Social Support and Non-Support). This study highlights the significance of fostering a supportive online environment and lays the groundwork for future research in automated social support detection.

accuracy, f1-score, weighted f1-score, (16 more...)

2502.0964

Country:

South America (0.04)
North America > Mexico > Mexico City > Mexico City (0.04)
North America > Central America (0.04)
(2 more...)

Genre: Research Report > New Finding (0.89)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.46)
Law > Civil Rights & Constitutional Law (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Survey on Recent Progress of AI for Chemistry: Methods, Applications, and Opportunities

Hu, Ding, Hua, Pengxiang, Huang, Zhen

The development of artificial intelligence (AI) techniques has brought revolutionary changes across various realms. In particular, the use of AI-assisted methods to accelerate chemical research has become a popular and rapidly growing trend, leading to numerous groundbreaking works. In this paper, we provide a comprehensive review of current AI techniques in chemistry from a computational perspective, considering various aspects in the design of methods. We begin by discussing the characteristics of data from diverse sources, followed by an overview of various representation methods. Next, we review existing models for several topical tasks in the field, and conclude by highlighting some key challenges that warrant further attention.

molecule, prediction, representation, (16 more...)

2502.17456

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Asia > China > Shanghai > Shanghai (0.04)
Asia > China > Anhui Province > Hefei (0.04)
(2 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.93)
Education (0.92)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Harikrishna Narasimhan, Rohit Vaish, Shivani Agarwal

On the Statistical Consistency of Plug-in Classifiers for Non-decomposable Performance Measures

Neural Information Processing SystemsFeb-8-2025, 23:38:55 GMT

We study consistency properties of algorithms for non-decomposable performance measures that cannot be expressed as a sum of losses on individual data points, such as the F-measure used in text retrieval and several other performance measures used in class imbalanced settings. While there has been much work on designing algorithms for such performance measures, there is limited understanding of the theoretical properties of these algorithms. Recently, Ye et al. (2012) showed consistency results for two algorithms that optimize the F-measure, but their results apply only to an idealized setting, where precise knowledge of the underlying probability distribution (in the form of the'true' posterior class probability) is available to a learning algorithm. In this work, we consider plug-in algorithms that learn a classifier by applying an empirically determined threshold to a suitable'estimate' of the class probability, and provide a general methodology to show consistency of these methods for any non-decomposable measure that can be expressed as a continuous function of true positive rate (TPR) and true negative rate (TNR), and for which the Bayes optimal classifier is the class probability function thresholded suitably. We use this template to derive consistency results for plug-in algorithms for the F-measure and for the geometric mean of TPR and precision; to our knowledge, these are the first such results for these measures. In addition, for continuous distributions, we show consistency of plug-in algorithms for any performance measure that is a continuous and monotonically increasing function of TPR and TNR. Experimental results confirm our theoretical findings.

artificial intelligence, machine learning, performance measure, (15 more...)

Neural Information Processing Systems

Country:

South America > Paraguay > Asunción > Asunción (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
Asia > India > Karnataka > Bengaluru (0.04)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Neural Information Processing SystemsFeb-8-2025, 18:31:38 GMT

Parallel Sampling of HDPs using Sub-Cluster Splits

We develop a sampling technique for Hierarchical Dirichlet process models. The parallel algorithm builds upon [1] by proposing large split and merge moves based on learned sub-clusters. The additional global split and merge moves drastically improve convergence in the experimental results. Furthermore, we discover that cross-validation techniques do not adequately determine convergence, and that previous sampling methods converge slower than were previously expected.

algorithm, convergence, equation, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

Jasper De Bock, Cassio P. de Campos, Alessandro Antonucci

Global Sensitivity Analysis for MAP Inference in Graphical Models

Neural Information Processing SystemsFeb-8-2025, 16:22:52 GMT

We study the sensitivity of a MAP configuration of a discrete probabilistic graphical model with respect to perturbations of its parameters. These perturbations are global, in the sense that simultaneous perturbations of all the parameters (or any chosen subset of them) are allowed. Our main contribution is an exact algorithm that can check whether the MAP configuration is robust with respect to given perturbations. Its complexity is essentially the same as that of obtaining the MAP configuration itself, so it can be promptly used with minimal effort. We use our algorithm to identify the largest global perturbation that does not induce a change in the MAP configuration, and we successfully apply this robustness measure in two practical scenarios: the prediction of facial action units with posed images and the classification of multiple real public data sets. A strong correlation between the proposed robustness measure and accuracy is verified in both scenarios.

artificial intelligence, instantiation, machine learning, (17 more...)

Neural Information Processing Systems

Country:

South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Artificial IntelligenceFeb-8-2025

Towards Trustworthy Retrieval Augmented Generation for Large Language Models: A Survey

Ni, Bo, Liu, Zheyuan, Wang, Leyao, Lei, Yongjia, Zhao, Yuying, Cheng, Xueqi, Zeng, Qingkai, Dong, Luna, Xia, Yinglong, Kenthapadi, Krishnaram, Rossi, Ryan, Dernoncourt, Franck, Tanjim, Md Mehrab, Ahmed, Nesreen, Liu, Xiaorui, Fan, Wenqi, Blasch, Erik, Wang, Yu, Jiang, Meng, Derr, Tyler

Retrieval-Augmented Generation (RAG) is an advanced technique designed to address the challenges of Artificial Intelligence-Generated Content (AIGC). By integrating context retrieval into content generation, RAG provides reliable and up-to-date external knowledge, reduces hallucinations, and ensures relevant context across a wide range of tasks. However, despite RAG's success and potential, recent studies have shown that the RAG paradigm also introduces new risks, including robustness issues, privacy concerns, adversarial attacks, and accountability issues. Addressing these risks is critical for future applications of RAG systems, as they directly impact their trustworthiness. Although various methods have been developed to improve the trustworthiness of RAG methods, there is a lack of a unified perspective and framework for research in this topic. Thus, in this paper, we aim to address this gap by providing a comprehensive roadmap for developing trustworthy RAG systems. We place our discussion around five key perspectives: reliability, privacy, safety, fairness, explainability, and accountability. For each perspective, we present a general framework and taxonomy, offering a structured approach to understanding the current challenges, evaluating existing solutions, and identifying promising future research directions. To encourage broader adoption and innovation, we also highlight the downstream applications where trustworthy RAG systems have a significant impact.

large language model, machine learning, natural language, (16 more...)

2502.06872

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Mexico > Mexico City > Mexico City (0.04)
(18 more...)

Genre:

Research Report > Promising Solution (1.00)
Overview (1.00)
Research Report > New Finding (0.67)
Instructional Material > Course Syllabus & Notes (0.67)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Teoh, Jayden, Li, Wenjun, Varakantham, Pradeep

Improving Environment Novelty Quantification for Effective Unsupervised Environment Design

arXiv.org Machine LearningFeb-8-2025

Unsupervised Environment Design (UED) formalizes the problem of autocurricula through interactive training between a teacher agent and a student agent. The teacher generates new training environments with high learning potential, curating an adaptive curriculum that strengthens the student's ability to handle unseen scenarios. Existing UED methods mainly rely on regret, a metric that measures the difference between the agent's optimal and actual performance, to guide curriculum design. Regret-driven methods generate curricula that progressively increase environment complexity for the student but overlook environment novelty -- a critical element for enhancing an agent's generalizability. Measuring environment novelty is especially challenging due to the underspecified nature of environment parameters in UED, and existing approaches face significant limitations. To address this, this paper introduces the Coverage-based Evaluation of Novelty In Environment (CENIE) framework. CENIE proposes a scalable, domain-agnostic, and curriculum-aware approach to quantifying environment novelty by leveraging the student's state-action space coverage from previous curriculum experiences. We then propose an implementation of CENIE that models this coverage and measures environment novelty using Gaussian Mixture Models. By integrating both regret and novelty as complementary objectives for curriculum design, CENIE facilitates effective exploration across the state-action space while progressively increasing curriculum complexity. Empirical evaluations demonstrate that augmenting existing regret-based UED algorithms with CENIE achieves state-of-the-art performance across multiple benchmarks, underscoring the effectiveness of novelty-driven autocurricula for robust generalization.

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Machine Learning

2502.05726

Country:

Asia > Singapore (0.04)
North America > United States (0.04)
South America > Brazil (0.04)
(18 more...)

Genre: Research Report > New Finding (0.92)

Industry:

Education > Curriculum (0.68)
Leisure & Entertainment > Sports > Motorsports (0.46)
Education > Educational Technology > Educational Software (0.34)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Machine LearningFeb-8-2025

Causal Learning for Heterogeneous Subgroups Based on Nonlinear Causal Kernel Clustering

Liu, Lu, Tang, Yang, Zhang, Kexuan, Sun, Qiyu

Due to the challenge posed by multi-source and heterogeneous data collected from diverse environments, causal relationships among features can exhibit variations influenced by different time spans, regions, or strategies. This diversity makes a single causal model inadequate for accurately representing complex causal relationships in all observational data, a crucial consideration in causal learning. To address this challenge, the nonlinear Causal Kernel Clustering method is introduced for heterogeneous subgroup causal learning, highlighting variations in causal relationships across diverse subgroups. The main component for clustering heterogeneous subgroups lies in the construction of the $u$-centered sample mapping function with the property of unbiased estimation, which assesses the differences in potential nonlinear causal relationships in various samples and supported by causal identifiability theory. Experimental results indicate that the method performs well in identifying heterogeneous subgroups and enhancing causal learning, leading to a reduction in prediction error.

artificial intelligence, causal relationship, machine learning, (11 more...)

arXiv.org Machine Learning

2501.11622

Country:

Indian Ocean (0.04)
Asia > China > Shanghai > Shanghai (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.66)