AITopics

2506.14146

Country: Asia > Singapore (0.14)

Genre:

Research Report (0.50)
Overview (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

arXiv.org Artificial Intelligence, Jun-18-2025

An Interdisciplinary Review of Commonsense Reasoning and Intent Detection

Sakib, Md Nazmus

This review explores recent advances in commonsense reasoning and intent detection, two key challenges in natural language understanding. We analyze 28 papers from ACL, EMNLP, and CHI (2020-2025), organizing them by methodology and application. Commonsense reasoning is reviewed across zero-shot learning, cultural adaptation, structured evaluation, and interactive contexts. Intent detection is examined through open-set models, generative formulations, clustering, and human-centered systems. By bridging insights from NLP and HCI, we highlight emerging trends toward more adaptive, multilingual, and context-aware models, and identify key gaps in grounding, generalization, and benchmark design.

artificial intelligence, computational linguistic, natural language, (14 more...)

2506.1404

Country:

Asia (0.94)
North America > United States > Maryland (0.28)

Genre: Overview (1.00)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

arXiv.org Artificial Intelligence, Jun-18-2025

Machine Mirages: Defining the Undefined

Tembine, Hamidou

As multimodal machine intelligence systems started achieving average animal-level and average human-level fluency in many measurable tasks in processing images, language, and sound, they began to exhibit a new class of cognitive aberrations: machine mirages. These include delusion, illusion, confabulation, hallucination, misattribution error, semantic drift, semantic compression, exaggeration, causal inference failure, uncanny valley of perception, bluffing-patter-bullshitting, cognitive stereotypy, pragmatic misunderstanding, hypersignification, semantic reheating-warming, simulated authority effect, fallacious abductive leap, contextual drift, referential hallucination, semiotic Frankenstein effect, calibration failure, spurious correlation, bias amplification, concept drift sensitivity, misclassification under uncertainty, adversarial vulnerability, overfitting, prosodic misclassification, accent bias, turn boundary failure, semantic boundary confusion, noise overfitting, latency-induced decision drift, ambiguity collapse and other forms of error that mimic but do not replicate human or animal fallibility. This article presents some of the errors and argues that these failures must be explicitly defined and systematically assessed. Understanding machine mirages is essential not only for improving machine intelligence reliability but also for constructing a multiscale ethical, co-evolving intelligence ecosystem that respects the diverse forms of life, cognition, and expression it will inevitably touch.

artificial intelligence, deep learning, machine learning, (18 more...)

2506.1399

Country: North America > United States (1.00)

Genre:

Summary/Review (1.00)
Research Report (1.00)
Overview (0.93)

Industry:

Government (0.93)
Law (0.68)
Health & Medicine > Diagnostic Medicine (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial Intelligence, Jun-17-2025

Forecast-Then-Optimize Deep Learning Methods

Jiang, Jinhang, Wu, Nan, Liu, Ben, Feng, Mei, Ji, Xin, Srinivasan, Karthik

Time series forecasting underpins vital decision-making across various sectors, yet raw predictions from sophisticated models often harbor systematic errors and biases. We examine the Forecast-Then-Optimize (FTO) framework and provide its first systematic synopsis. Unlike conventional Predict-Then-Optimize (PTO) methods, FTO explicitly refines forecasts through optimization techniques such as ensemble methods, meta-learners, and uncertainty adjustments. Furthermore, deep learning and large language models have established superiority over traditional parametric forecasting models for most enterprise applications. This paper surveys significant advancements from 2016 to 2025, analyzing mainstream deep learning FTO architectures. Focusing on real-world applications in operations management, we demonstrate FTO's crucial role in enhancing predictive accuracy, robustness, and decision efficacy. Our study establishes foundational guidelines for future forecasting methodologies, bridging theory and operational practicality.
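As a rough illustration of the forecast-refinement idea in the abstract above, the sketch below blends two raw forecasts with a least-squares weight fitted on held-out actuals. It is one simple ensemble-style refinement, not the survey's method; the function names (`fit_blend_weight`, `blend`) and the toy numbers are assumptions made up for this example.

```python
def fit_blend_weight(forecast_a, forecast_b, actuals):
    """Least-squares weight w for the blend w*a + (1-w)*b, clipped to [0, 1].

    Hypothetical helper: minimizes the sum of squared errors of the blend
    against held-out actuals.
    """
    numerator = sum((y - b) * (a - b)
                    for a, b, y in zip(forecast_a, forecast_b, actuals))
    denominator = sum((a - b) ** 2 for a, b in zip(forecast_a, forecast_b))
    if denominator == 0.0:
        return 0.5  # forecasts are identical, so any weight gives the same blend
    return min(1.0, max(0.0, numerator / denominator))


def blend(forecast_a, forecast_b, weight):
    """Apply the fitted weight to produce the refined forecast."""
    return [weight * a + (1.0 - weight) * b
            for a, b in zip(forecast_a, forecast_b)]


def mse(forecast, actuals):
    """Mean squared error of a forecast against actuals."""
    return sum((f - y) ** 2 for f, y in zip(forecast, actuals)) / len(actuals)


# Toy data (made up): one model over-forecasts by 1, the other under-forecasts by 1.
actuals = [10.0, 12.0, 14.0]
raw_a = [11.0, 13.0, 15.0]
raw_b = [9.0, 11.0, 13.0]

weight = fit_blend_weight(raw_a, raw_b, actuals)
refined = blend(raw_a, raw_b, weight)
```

On this toy data the opposite biases cancel, so the blended forecast has lower error than either raw forecast. In practice the weight would be fitted on a validation window and applied to later forecasts.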

artificial intelligence, forecasting, machine learning, (14 more...)

2506.13036

Country:

North America > United States > Kansas (0.04)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(4 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.92)

Industry:

Health & Medicine (1.00)
Transportation (0.93)
Energy > Power Industry (0.92)
(2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Jun-17-2025

Antibody Foundational Model: Ab-RoBERTa

Huh, Eunna, Lee, Hyeonsu, Shin, Hyunjin

With the growing prominence of antibody-based therapeutics, antibody engineering has gained increasing attention as a critical area of research and development. Recent progress in transformer-based protein large language models (LLMs) has demonstrated promising applications in protein sequence design and structural prediction. Moreover, the availability of large-scale antibody datasets such as the Observed Antibody Space (OAS) database has opened new avenues for the development of LLMs specialized for processing antibody sequences. Among these, RoBERTa has demonstrated improved performance relative to BERT, while maintaining a smaller parameter count (125M) compared to the BERT-based protein model, ProtBERT (420M). This reduced model size enables more efficient deployment in antibody-related applications. However, despite the numerous advantages of the RoBERTa architecture, antibody-specific foundational models built upon it have remained inaccessible to the research community. In this study, we introduce Ab-RoBERTa, a RoBERTa-based antibody-specific LLM, which is publicly available at https://huggingface.co/mogam-ai/Ab-RoBERTa. This resource is intended to support a wide range of antibody-related research applications including paratope prediction or humanness assessment.

large language model, machine learning, natural language, (18 more...)

2506.13006

Country: Asia > South Korea (0.04)

Genre:

Research Report > New Finding (0.48)
Overview > Growing Problem (0.34)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.95)

arXiv.org Artificial Intelligence, Jun-17-2025

An Interdisciplinary Approach to Human-Centered Machine Translation

Carpuat, Marine, Asscher, Omri, Bali, Kalika, Bentivogli, Luisa, Blain, Frédéric, Bowker, Lynne, Choudhury, Monojit, Daumé, Hal III, Duh, Kevin, Gao, Ge, Grissom, Alvin II, Karpinska, Marzena, Khoong, Elaine C., Lewis, William D., Martins, André F. T., Nurminen, Mary, Oard, Douglas W., Popovic, Maja, Simard, Michel, Yvon, François

Machine Translation (MT) tools are widely used today, often in contexts where professional translators are not present. Despite progress in MT technology, a gap persists between system development and real-world usage, particularly for non-expert users who may struggle to assess translation reliability. This paper advocates for a human-centered approach to MT, emphasizing the alignment of system design with diverse communicative goals and contexts of use. We survey the literature in Translation Studies and Human-Computer Interaction to recontextualize MT evaluation and design to address the diverse real-world scenarios in which MT is used today.

machine learning, natural language, translation, (16 more...)

2506.13468

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Portugal > Lisbon > Lisbon (0.14)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
(42 more...)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.46)

Industry:

Law (0.93)
Health & Medicine > Therapeutic Area (0.68)
Education > Curriculum > Subject-Specific Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Krupakar, Hans, A, Kandappan V

A Review of the Long Horizon Forecasting Problem in Time Series Analysis

The long horizon forecasting (LHF) problem has appeared in the time series literature for roughly the last 35 years. This review covers aspects of LHF in this period and how deep learning has incorporated variants of trend, seasonality, Fourier and wavelet transforms, misspecification bias reduction and bandpass filters, while contributing convolutions, residual connections, sparsity reduction, strided convolutions, attention masks, SSMs, normalization methods, low-rank approximations and gating mechanisms. We highlight time series decomposition techniques, input data preprocessing and dataset windowing schemes that improve performance. Multi-layer perceptron models, recurrent neural network hybrids, and self-attention models that improve performance on the LHF problem are described, with an emphasis on the feature space construction. Ablation studies are conducted over the ETTm2 dataset in the multivariate and univariate high useful load (HUFL) forecasting contexts, evaluated over the last 4 months of the dataset. The heatmaps of MSE averages per time step over test set series in the horizon show that the error increases steadily with horizon length, except with the xLSTM and Triformer models, and motivate LHF as an error propagation problem. The trained models are available here: https://bit.ly/LHFModelZoo
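The per-time-step MSE averages mentioned in the abstract above can be sketched as follows: for each position in the forecast horizon, average the squared error over all test series. This is a minimal illustration, not the authors' evaluation code; the function name `mse_per_horizon_step` and the toy numbers are assumptions.

```python
def mse_per_horizon_step(predictions, targets):
    """Average squared error at each horizon step across all test series.

    predictions, targets: lists of series, each a list of H forecast values.
    Returns a list of H values, one MSE per horizon step.
    """
    if not predictions or len(predictions) != len(targets):
        raise ValueError("need the same non-zero number of predicted and target series")
    horizon = len(predictions[0])
    totals = [0.0] * horizon
    for predicted, actual in zip(predictions, targets):
        if len(predicted) != horizon or len(actual) != horizon:
            raise ValueError("every series must cover the full horizon")
        for step, (p, y) in enumerate(zip(predicted, actual)):
            totals[step] += (p - y) ** 2
    return [total / len(predictions) for total in totals]


# Toy data (made up): two test series with a horizon of three steps,
# where the forecasts get worse further into the horizon.
targets = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]]
predictions = [[1.0, 2.5, 4.0], [2.0, 2.5, 5.0]]

step_mse = mse_per_horizon_step(predictions, targets)
```

One row of such values per model, stacked, gives the kind of heatmap the abstract describes; an error curve that rises with the step index is what the authors read as error propagation.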

artificial intelligence, forecasting, machine learning, (16 more...)

2506.12809

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
Asia > India > Tamil Nadu > Chennai (0.04)
Oceania > New Zealand (0.04)
(2 more...)

Genre: Overview (0.88)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

PROTOCOL: Partial Optimal Transport-enhanced Contrastive Learning for Imbalanced Multi-view Clustering

Xue, Xuqian, Lei, Yiming, Cai, Qi, Shan, Hongming, Zhang, Junping

artificial intelligence, machine learning, partial optimal transport-enhanced contrastive learning, (8 more...)

While contrastive multi-view clustering has achieved remarkable success, it implicitly assumes a balanced class distribution. However, real-world multi-view data typically exhibit an imbalanced class distribution. Consequently, existing methods suffer performance degradation due to their inability to perceive and model such imbalance. To address this challenge, we present the first systematic study of imbalanced multi-view clustering, focusing on two fundamental problems: (i) perceiving the imbalanced class distribution, and (ii) mitigating representation degradation of minority samples. We propose PROTOCOL, a novel PaRtial Optimal TranspOrt-enhanced COntrastive Learning framework for imbalanced multi-view clustering. First, for class imbalance perception, we map multi-view features into a consensus space and reformulate imbalanced clustering as a partial optimal transport (POT) problem, augmented with progressive mass constraints and weighted KL divergence for class distributions. Second, we develop POT-enhanced class-rebalanced contrastive learning at both the feature and class levels, incorporating logit adjustment and class-sensitive learning to enhance minority sample representations. Extensive experiments demonstrate that PROTOCOL significantly improves clustering performance on imbalanced multi-view data, filling a critical research gap in this field.

2506.12408

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > Canada (0.04)
Asia > Singapore (0.04)
Asia > China > Shandong Province > Qingdao (0.04)

Genre:

Research Report (1.00)
Overview (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

Learning Causality for Modern Machine Learning

Chen, Yongqiang

In the past decades, machine learning with Empirical Risk Minimization (ERM) has demonstrated great capability in learning and exploiting the statistical patterns from data, in some cases even surpassing humans. Despite this success, ERM avoids modeling causality, the way of understanding and handling changes, which is fundamental to human intelligence. When deploying models beyond the training environment, distribution shifts are everywhere. For example, an autopilot system often needs to deal with new weather conditions that have not been seen during training, and an AI-aided drug discovery system needs to predict the biochemical properties of molecules with respect to new viruses such as COVID-19. This makes the problem of Out-of-Distribution (OOD) generalization challenging for conventional machine learning. In this thesis, we investigate how to incorporate and realize causality for broader tasks in modern machine learning. In particular, we exploit the invariance implied by the principle of independent causal mechanisms (ICM), that is, the causal mechanisms generating the effects from causes do not inform or influence each other. Therefore, the conditional distribution of the target variable given its causes is invariant under distribution shifts. With the causal invariance principle, we first instantiate it on graphs -- a general data structure ubiquitous in many real-world industry and scientific applications, such as financial networks and molecules. Then, we shall see how learning causality benefits many of the desirable properties of modern machine learning, in terms of (i) OOD generalization capability; (ii) interpretability; and (iii) robustness to adversarial attacks. Realizing causality in machine learning, on the other hand, raises a dilemma for optimization in conventional machine learning, as it often contradicts the objective of ERM...

artificial intelligence, machine learning, reinforcement learning, (19 more...)

2506.12226

Country:

Africa (0.13)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Oceania (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Information Technology > Security & Privacy (0.87)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.65)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

Rajaram, Sara, Cotton, R. James, Sinz, Fabian H.

Similarity as Reward Alignment: Robust and Versatile Preference-based Reinforcement Learning

Preference-based Reinforcement Learning (PbRL) entails a variety of approaches for aligning models with human intent to alleviate the burden of reward engineering. However, most previous PbRL work has not investigated the robustness to labeler errors, inevitable with labelers who are non-experts or operate under time constraints. Additionally, PbRL algorithms often target very specific settings (e.g. pairwise ranked preferences or purely offline learning). We introduce Similarity as Reward Alignment (SARA), a simple contrastive framework that is both resilient to noisy labels and adaptable to diverse feedback formats and training paradigms. SARA learns a latent representation of preferred samples and computes rewards as similarities to the learned latent. We demonstrate strong performance compared to baselines on continuous control offline RL benchmarks. We further demonstrate SARA's versatility in applications such as trajectory filtering for downstream tasks, cross-task preference transfer, and reward shaping in online learning.
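The abstract above describes computing rewards as similarities to a learned latent of preferred samples. A minimal sketch of that last step follows. It makes two loud assumptions: the latent here is simply the mean of hand-written embeddings (a stand-in for SARA's contrastively learned latent, which the abstract does not specify), and cosine similarity is used as the similarity measure. All function names and numbers are hypothetical.

```python
import math


def mean_vector(vectors):
    """Component-wise mean of equal-length vectors (stand-in for a learned latent)."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_reward(sample_embedding, preferred_latent):
    """Reward = similarity between a sample's embedding and the preferred latent."""
    return cosine_similarity(sample_embedding, preferred_latent)


# Toy embeddings (made up) of samples a labeler marked as preferred.
preferred_embeddings = [[1.0, 0.0], [0.8, 0.2], [0.9, 0.1]]
preferred_latent = mean_vector(preferred_embeddings)

aligned_reward = similarity_reward([1.0, 0.1], preferred_latent)
opposed_reward = similarity_reward([-1.0, 0.0], preferred_latent)
```

A sample whose embedding points in the same direction as the preferred latent receives a reward near 1, while one pointing the opposite way receives a negative reward. Averaging over many preferred samples is also one intuitive reason such a scheme could tolerate a few mislabeled examples, though the abstract does not state the mechanism.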

large language model, machine learning, trajectory, (15 more...)

2506.12529

Country:

Europe > Austria > Vienna (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Europe > Germany > Lower Saxony > Gottingen (0.04)
(4 more...)

Genre:

Research Report (0.50)
Overview (0.46)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)