AITopics | ano

Stabilizing black-box algorithms through task-oriented randomization

Wang, Yali, Wang, Zhaojun

arXiv.org Machine Learning, Jun-25-2026

Abstract: As black-box models become foundational to modern research, ensuring their stability is paramount for the realiza[...] The inherent diversity of inputs, ranging from structured Gaussian distributions to complex data with unknown structures, poses a significant challenge: how to stabilize black-box outputs while effectively leveraging available prior information. This paper introduces a task-oriented randomization methodology that adaptively tailors its strategy to the underlying generative mechanisms of the input data, specifically addressing unstructured complexities. [...] comprehensive suite of stability guarantees is proposed. Beyond establishing rigorous theoretical foundations for stability, the [...]

[...] solution that can be applied across a wide range of scientific and industrial domains. Notwithstanding its widespread application, the framework exhibits certain shortcomings when dealing with complex datasets. First, standard resampling schemes often fail to account for the underlying data structures; as a result, the drawn samples cannot reflect the true data distribution, thereby [...] Second, effective sampling requires prior knowledge of the distribution, which is often unattainable in practical environments.

algorithm, artificial intelligence, machine learning, (19 more...)

arXiv.org Machine Learning

2606.25269

Genre:

Research Report (1.00)
Overview (0.68)

Industry: Transportation > Air (0.83)

Technology:

Information Technology > Data Science (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

Pay Attention to Small Weights

Zhou, Chao, Jacobs, Tom, Gadhikar, Advait, Burkholz, Rebekka

arXiv.org Artificial Intelligence, Oct-23-2025

Finetuning large pretrained neural networks is known to be resource-intensive, both in terms of memory and computational cost. To mitigate this, a common approach is to restrict training to a subset of the model parameters. By analyzing the relationship between gradients and weights during finetuning, we observe a notable pattern: large gradients are often associated with small-magnitude weights. This correlation is more pronounced in finetuning settings than in training from scratch. Motivated by this observation, we propose NANOADAM, which dynamically updates only the small-magnitude weights during finetuning and offers several practical advantages: first, this criterion is gradient-free, so the parameter subset can be determined without gradient computation; second, it preserves large-magnitude weights, which are likely to encode critical features learned during pretraining, thereby reducing the risk of catastrophic forgetting; third, it permits the use of larger learning rates and consistently leads to better generalization performance in experiments. We demonstrate this for both NLP and vision tasks.
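The selection rule described in this abstract can be illustrated with a small, dependency-free sketch. This is not the authors' NANOADAM implementation: the function names `small_weight_mask` and `masked_adam_step`, the `fraction` parameter, and the toy values are all assumptions made for illustration. It only shows the general idea of choosing the update subset from weight magnitudes alone (no gradients needed for the choice) and then applying a standard Adam step to that subset.

```python
import math

def small_weight_mask(weights, fraction):
    """Return a 0/1 mask selecting the `fraction` of weights with smallest |w|.

    The mask depends only on the weights, so it can be built without gradients.
    """
    k = max(1, int(len(weights) * fraction))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    chosen = set(order[:k])
    return [1 if i in chosen else 0 for i in range(len(weights))]

def masked_adam_step(weights, grads, mask, m, v, t,
                     lr=1e-2, b1=0.9, b2=0.999, eps=1e-8):
    """One standard Adam step applied only where mask == 1.

    Weights with mask == 0 are left unchanged (frozen). The moment buffers
    `m` and `v` are updated in place for the selected entries.
    """
    new_w = list(weights)
    for i, use in enumerate(mask):
        if not use:
            continue
        m[i] = b1 * m[i] + (1 - b1) * grads[i]
        v[i] = b2 * v[i] + (1 - b2) * grads[i] ** 2
        m_hat = m[i] / (1 - b1 ** t)   # bias-corrected first moment
        v_hat = v[i] / (1 - b2 ** t)   # bias-corrected second moment
        new_w[i] = weights[i] - lr * m_hat / (math.sqrt(v_hat) + eps)
    return new_w

# Toy values, purely illustrative.
weights = [0.9, -0.05, 0.4, 0.01, -1.2, 0.1]
grads = [0.1, 0.5, -0.2, -0.7, 0.05, 0.3]
mask = small_weight_mask(weights, fraction=0.5)
m = [0.0] * len(weights)
v = [0.0] * len(weights)
updated = masked_adam_step(weights, grads, mask, m, v, t=1)
```

In this toy run the three smallest-magnitude entries are selected and the large-magnitude entries keep their original values. How often the mask is recomputed during training is a design choice the sketch leaves open.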

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2506.21374

Country: Europe (0.28)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
(2 more...)

Quantum Reinforcement Learning by Adaptive Non-local Observables

Lin, Hsin-Yi, Chen, Samuel Yen-Chi, Tseng, Huan-Hsin, Yoo, Shinjae

arXiv.org Artificial Intelligence, Jul-29-2025

Hybrid quantum-classical frameworks leverage quantum computing for machine learning; however, variational quantum circuits (VQCs) are limited by the need for local measurements. We introduce an adaptive non-local observable (ANO) paradigm within VQCs for quantum reinforcement learning (QRL), jointly optimizing circuit parameters and multi-qubit measurements. The ANO-VQC architecture serves as the function approximator in Deep Q-Network (DQN) and Asynchronous Advantage Actor-Critic (A3C) algorithms. On multiple benchmark tasks, ANO-VQC agents outperform baseline VQCs. Ablation studies reveal that adaptive measurements enhance the function space without increasing circuit depth. Our results demonstrate that adaptive multi-qubit observables can enable practical quantum advantages in reinforcement learning.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2507.19629

Country: North America > United States (0.29)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Adaptive Non-local Observable on Quantum Neural Networks

Lin, Hsin-Yi, Tseng, Huan-Hsin, Chen, Samuel Yen-Chi, Yoo, Shinjae

arXiv.org Artificial Intelligence, Jul-15-2025

Conventional Variational Quantum Circuits (VQCs) for Quantum Machine Learning typically rely on a fixed Hermitian observable, often built from Pauli operators. Inspired by the Heisenberg picture, we propose an adaptive non-local measurement framework that substantially increases the model complexity of the quantum circuits. Our introduction of dynamical Hermitian observables with evolving parameters shows that optimizing VQC rotations corresponds to tracing a trajectory in the observable space. This viewpoint reveals that standard VQCs are merely a special case of the Heisenberg representation. Furthermore, we show that properly incorporating variational rotations with non-local observables enhances qubit interaction and information mixture, admitting flexible circuit designs. Two non-local measurement schemes are introduced, and numerical simulations on classification tasks confirm that our approach outperforms conventional VQCs, yielding a more powerful and resource-efficient approach as a Quantum Neural Network. Quantum Machine Learning (QML) is a developing field that leverages the principles of quantum mechanics to advance machine learning (ML) models. With the rapid advancement of quantum computing hardware, QML aims to exploit quantum phenomena, such as superposition, entanglement, and quantum interference, to provide computational advantages over classical approaches. Despite the current limitations of quantum hardware, hybrid quantum-classical algorithms have been developed to harness the strengths of both computing paradigms, allowing near-term quantum devices to contribute meaningfully to real-world ML tasks.
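As a rough illustration of what a parameterized Hermitian observable means in practice, the sketch below builds a Hermitian matrix from free parameters and evaluates its expectation value on a two-qubit state. It is not the paper's measurement scheme: the helper names `hermitian_from_params` and `expectation`, the parameter layout, and the example state are assumptions chosen for illustration. A fixed Pauli observable (here Z on the first qubit) appears as one particular parameter setting.

```python
import math

def hermitian_from_params(diag, upper):
    """Build an n x n Hermitian matrix from free parameters.

    diag:  list of n real numbers (diagonal entries).
    upper: dict mapping (i, j) with i < j to a complex entry H[i][j];
           H[j][i] is set to its complex conjugate.
    """
    n = len(diag)
    H = [[0j] * n for _ in range(n)]
    for i, d in enumerate(diag):
        H[i][i] = complex(d, 0.0)
    for (i, j), z in upper.items():
        H[i][j] = complex(z)
        H[j][i] = complex(z).conjugate()
    return H

def expectation(state, H):
    """Return <state|H|state>, which is real when H is Hermitian."""
    n = len(state)
    total = 0j
    for i in range(n):
        for j in range(n):
            total += state[i].conjugate() * H[i][j] * state[j]
    return total.real

# Two-qubit state (|00> + |11>) / sqrt(2), basis order |00>, |01>, |10>, |11>.
s = 1.0 / math.sqrt(2.0)
bell = [complex(s), 0j, 0j, complex(s)]

# Fixed local observable: Pauli Z on the first qubit, i.e. diag(1, 1, -1, -1).
z_first = hermitian_from_params([1.0, 1.0, -1.0, -1.0], {})
e_local = expectation(bell, z_first)

# A non-local observable with an off-diagonal entry coupling |00> and |11>.
coupled = hermitian_from_params([0.0, 0.0, 0.0, 0.0], {(0, 3): 1.0})
e_nonlocal = expectation(bell, coupled)
```

On this state the local Z measurement averages to zero, while the observable with a coupling between |00> and |11> returns a value close to one. In an adaptive setting the entries of `diag` and `upper` would be trained alongside the circuit parameters; the training loop is omitted here.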

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2504.13414

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Nano: Nested Human-in-the-Loop Reward Learning for Few-shot Language Model Control

Fan, Xiang, Lyu, Yiwei, Liang, Paul Pu, Salakhutdinov, Ruslan, Morency, Louis-Philippe

arXiv.org Artificial Intelligence, Sep-22-2023

Pretrained language models have demonstrated extraordinary capabilities in language generation. However, real-world tasks often require controlling the distribution of generated text in order to mitigate bias, promote fairness, and achieve personalization. Existing techniques for controlling the distribution of generated text only work with quantified distributions, which require pre-defined categories, proportions of the distribution, or an existing corpus following the desired distributions. However, many important distributions, such as personal preferences, are unquantified. In this work, we tackle the problem of generating text following arbitrary distributions (quantified and unquantified) by proposing Nano, a few-shot human-in-the-loop training algorithm that continuously learns from human feedback. Nano achieves state-of-the-art results on single topic/attribute as well as quantified distribution control compared to previous works. We also show that Nano is able to learn unquantified distributions, achieves personalization, and captures differences between different individuals' personal preferences with high sample efficiency.

annotator, ano, language model, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.18653/v1/2023.findings-acl.758

2211.0575

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Iowa (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
(16 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Media (1.00)
Leisure & Entertainment (1.00)
Health & Medicine (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Rockmate: an Efficient, Fast, Automatic and Generic Tool for Re-materialization in PyTorch

Zhao, Xunyi, Hellard, Théotime Le, Eyraud, Lionel, Gusak, Julia, Beaumont, Olivier

arXiv.org Artificial Intelligence, Jul-3-2023

We propose Rockmate to control the memory requirements when training PyTorch DNN models. Rockmate is an automatic tool that starts from the model code and generates an equivalent model, using a predefined amount of memory for activations, at the cost of a few re-computations. Rockmate automatically detects the structure of computational and data dependencies and rewrites the initial model as a sequence of complex blocks. We show that such a structure is widespread and can be found in many models in the literature (Transformer-based models, ResNet, RegNets, ...). This structure allows us to solve the problem in a fast and efficient way, using an adaptation of Checkmate (too slow on the whole model but general) at the level of individual blocks and an adaptation of Rotor (fast but limited to sequential models) at the level of the sequence itself. We show through experiments on many models that Rockmate is as fast as Rotor and as efficient as Checkmate, and that in many cases it achieves significantly lower memory consumption for activations (by a factor of 2 to 5) for a rather negligible overhead (on the order of 10% to 20%). Rockmate is open source and available at https://github.com/topal-team/rockmate.
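The memory-for-recomputation trade-off behind re-materialization can be sketched in a few lines of plain Python. This is a generic checkpointing illustration on a sequential chain, not Rockmate's algorithm or API: the functions `forward_with_checkpoints` and `activation_at`, the `stride` parameter, and the toy layers are assumptions made for the example.

```python
def forward_with_checkpoints(layers, x, stride):
    """Run a sequential chain, storing only every `stride`-th activation.

    Returns the final output and a dict {layer_index: activation}, where
    index 0 is the input and index i is the output of the i-th layer.
    """
    checkpoints = {0: x}
    for i, layer in enumerate(layers, start=1):
        x = layer(x)
        if i % stride == 0:
            checkpoints[i] = x
    return x, checkpoints

def activation_at(layers, checkpoints, index):
    """Recover the activation after layer `index` from the nearest checkpoint.

    Returns the activation and the number of layers that had to be re-run.
    """
    start = max(k for k in checkpoints if k <= index)
    x = checkpoints[start]
    recomputed = 0
    for layer in layers[start:index]:
        x = layer(x)
        recomputed += 1
    return x, recomputed

# Toy chain of four scalar "layers": v -> v * a + 1.
layers = [lambda v, a=a: v * a + 1 for a in (2, 3, 4, 5)]
out, checkpoints = forward_with_checkpoints(layers, 1, stride=2)
act3, cost3 = activation_at(layers, checkpoints, 3)
```

With `stride=2` only the input and the activations after layers 2 and 4 are kept, and the activation after layer 3 is recovered by re-running a single layer. A larger stride stores less and recomputes more, which is the trade-off a re-materialization tool has to schedule.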

artificial intelligence, machine learning, torch, (18 more...)

arXiv.org Artificial Intelligence

2307.01236

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Root-cause Analysis for Time-series Anomalies via Spatiotemporal Graphical Modeling in Distributed Complex Systems

Liu, Chao, Lore, Kin Gwn, Jiang, Zhanhong, Sarkar, Soumik

arXiv.org Machine Learning, May-30-2018

Performance monitoring, anomaly detection, and root-cause analysis in complex cyber-physical systems (CPSs) are often highly intractable due to widely diverse operational modes, disparate data types, and complex fault propagation mechanisms. This paper presents a new data-driven framework for root-cause analysis, based on a spatiotemporal graphical modeling approach built on the concept of symbolic dynamics for discovering and representing causal interactions among sub-systems of complex CPSs. We formulate the root-cause analysis problem as a minimization problem via the proposed inference-based metric and present two approximate approaches for root-cause analysis, namely sequential state switching ($S^3$, based on the free-energy concept of a restricted Boltzmann machine, RBM) and artificial anomaly association ($A^3$, a classification framework using deep neural networks, DNN). Synthetic data from cases with failed pattern(s) and anomalous node(s) are simulated to validate the proposed approaches. A real dataset based on the Tennessee Eastman process (TEP) is also used for comparison with other approaches. The results show that: (1) the $S^3$ and $A^3$ approaches can obtain high accuracy in root-cause analysis under both pattern-based and node-based fault scenarios, in addition to successfully handling multiple nominal operating modes, (2) the proposed tool-chain is shown to be scalable while maintaining high accuracy, and (3) the proposed framework is robust and adaptive in different fault conditions and performs better in comparison with the state-of-the-art methods.

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Machine Learning

1805.12296

Country: North America > United States > Tennessee (0.24)

Genre: Research Report (1.00)

Industry:

Energy (1.00)
Aerospace & Defense (0.67)
Materials > Chemicals (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
(2 more...)
