8171ac2c5544a5cb54ac0f38bf477af4-AuthorFeedback.pdf

Neural Information Processing Systems

We thank the reviewers (R) for their insightful comments. We address the concerns of each reviewer below. R1 & R2: Do marginal VAEs scale to high-dimensional data? As pointed out by R1, this should not be a big problem since they are fit to one-dimensional variables. In Appendix D.2, we have evaluated the approximation quality of each marginal VAE, which is indeed very high.


Self-Reflective Generation at Test Time

Mu, Jian, Zhang, Qixin, Wang, Zhiyong, Yang, Menglin, Qiu, Shuang, Qin, Chengwei, Dai, Zhongxiang, Shu, Yao

arXiv.org Artificial Intelligence

Large language models (LLMs) increasingly solve complex reasoning tasks via long chain-of-thought, but their forward-only autoregressive generation process is fragile; early token errors can cascade, which creates a clear need for self-reflection mechanisms. However, existing self-reflection either performs revisions over full drafts or learns self-correction via expensive training, both fundamentally reactive and inefficient. To address this, we propose Self-Reflective Generation at Test Time (SRGen), a lightweight test-time framework that reflects before generating at uncertain points. During token generation, SRGen utilizes dynamic entropy thresholding to identify high-uncertainty tokens. For each identified token, it trains a specific corrective vector, which fully exploits the already generated context for a self-reflective generation to correct the token probability distribution. By retrospectively analyzing the partial output, this self-reflection enables more trustworthy decisions, thereby significantly reducing the probability of errors at highly uncertain points. Evaluated on challenging mathematical reasoning benchmarks and a diverse set of LLMs, SRGen can consistently strengthen model reasoning: improvements in single-pass quality also translate into stronger self-consistency voting. The ability to execute complex multi-step reasoning remains a central frontier in advancing large language models (LLMs). LLMs generate step-by-step reasoning traces, often called chain-of-thought (CoT) (Wei et al., 2022). This capability has enabled substantial progress in mathematics, program synthesis, and other domains (Yao et al., 2023; Plaat et al., 2024). The fidelity of these traces often determines whether the final answer is correct (Paul et al., 2024; Hammoud et al., 2025). Thus, improving the reliability of the reasoning process is critical to realizing the full potential of LLMs.
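The abstract mentions dynamic entropy thresholding without giving the rule, so the following is only a minimal sketch of the general idea, not SRGen's actual procedure. It assumes a hypothetical threshold of "running mean plus k standard deviations" over a short window of previous steps; the names `token_entropy`, `flag_uncertain_steps`, `window`, and `k` are illustrative and do not come from the paper.

```python
import math


def token_entropy(probs):
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def flag_uncertain_steps(step_probs, window=4, k=1.0):
    """Return the indices of generation steps flagged as high-uncertainty.

    ASSUMPTION: the threshold at step t is mean + k * std of the entropies
    of the previous `window` steps. This rule is illustrative only.
    """
    entropies = [token_entropy(p) for p in step_probs]
    flagged = []
    for t, h in enumerate(entropies):
        history = entropies[max(0, t - window):t]
        if len(history) < 2:
            continue  # too little history to form a threshold
        mean = sum(history) / len(history)
        var = sum((x - mean) ** 2 for x in history) / len(history)
        if h > mean + k * math.sqrt(var):
            flagged.append(t)
    return flagged


# Toy distributions over a 4-token vocabulary (made-up numbers).
confident = [0.97, 0.01, 0.01, 0.01]
uncertain = [0.25, 0.25, 0.25, 0.25]
steps = [confident, confident, confident, uncertain, confident]
print(flag_uncertain_steps(steps))
```

In this toy run only the uniform distribution stands out against the preceding low-entropy steps. A method of the kind described would intervene at such a flagged step, whereas this sketch only detects it.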


Attack Tree Analysis for Adversarial Evasion Attacks

Yamaguchi, Yuki, Aoki, Toshiaki

arXiv.org Artificial Intelligence

Abstract: Recently, the evolution of deep learning has promoted the application of machine learning (ML) to various systems. However, there are ML systems, such as autonomous vehicles, that cause critical damage when they misclassify. Conversely, there are ML-specific attacks called adversarial attacks based on the characteristics of ML systems. For example, one type of adversarial attack is an evasion attack, which uses minute perturbations called "adversarial examples" to intentionally cause classifiers to misclassify. Therefore, it is necessary to analyze the risk of ML-specific attacks when introducing ML-based systems. In this study, we propose a quantitative evaluation method for analyzing the risk of evasion attacks using attack trees. The proposed method consists of the extension of the conventional attack tree to analyze evasion attacks and the systematic construction method of the extension. In the extension of the conventional attack tree, we introduce ML and conventional attack nodes to represent various characteristics of evasion attacks. In the systematic construction process, we propose a procedure to construct the attack tree. The procedure consists of three steps: (1) organizing information about attack methods in the literature to a matrix, (2) identifying evasion attack scenarios from methods in the matrix, and (3) constructing the attack tree from the identified scenarios using a pattern. Finally, we conducted experiments on three ML image recognition systems to demonstrate the versatility and [...]

Figure 1: Evasion attack using physical adversarial examples made the Tesla autopilot function change to an opposing lane in the experiment [2].

Several ML systems are safety-critical such as autonomous driving. [...] some ML-specific vulnerabilities result from the characteristics of ML. [...] Let us consider analyzing evasion attacks using conventional [...] Analysts set leaf nodes to determine the probability that the attacks succeed and compute the attributes [...] computes the error rate of the classifier experimentally.
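To illustrate how success probabilities set on leaf nodes can be propagated to the root of an attack tree, here is a minimal sketch using the conventional AND/OR gate rules under an independence assumption. It is not the extended attack tree proposed in the paper; the `Node` class, the node names, and all probabilities are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    gate: str = "LEAF"             # "LEAF", "AND", or "OR"
    prob: Optional[float] = None   # success probability, leaves only
    children: List["Node"] = field(default_factory=list)


def success_probability(node: Node) -> float:
    """Propagate leaf probabilities to `node`, assuming independent children."""
    if node.gate == "LEAF":
        if node.prob is None:
            raise ValueError(f"leaf {node.name!r} has no probability")
        return node.prob
    child_probs = [success_probability(c) for c in node.children]
    if node.gate == "AND":         # every sub-attack must succeed
        result = 1.0
        for p in child_probs:
            result *= p
        return result
    if node.gate == "OR":          # at least one sub-attack succeeds
        all_fail = 1.0
        for p in child_probs:
            all_fail *= 1.0 - p
        return 1.0 - all_fail
    raise ValueError(f"unknown gate {node.gate!r}")


# Hypothetical tree; every number below is made up for illustration.
tree = Node("misclassification", "OR", children=[
    Node("physical patch attack", "AND", children=[
        Node("place patch in scene", prob=0.5),      # conventional step
        Node("patch fools classifier", prob=0.4),    # e.g. a measured error rate
    ]),
    Node("digital perturbation", prob=0.1),
])
print(round(success_probability(tree), 4))
```

With these made-up values the AND branch yields 0.5 × 0.4 = 0.2, and the OR root yields 1 − (1 − 0.2)(1 − 0.1) = 0.28.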


The Glass Ceiling of Automatic Evaluation in Natural Language Generation

Colombo, Pierre, Peyrard, Maxime, Noiry, Nathan, West, Robert, Piantanida, Pablo

arXiv.org Artificial Intelligence

Automatic evaluation metrics capable of replacing human judgments are critical to allowing fast development of new methods. Thus, numerous research efforts have focused on crafting such metrics. In this work, we take a step back and analyze recent progress by comparing the body of existing automatic metrics and human metrics altogether. As metrics are used based on how they rank systems, we compare metrics in the space of system rankings. Our extensive statistical analysis reveals surprising findings: automatic metrics -- old and new -- are much more similar to each other than to humans. Automatic metrics are not complementary and rank systems similarly. Strikingly, human metrics predict each other much better than the combination of all automatic metrics used to predict a human metric. This is surprising because human metrics are often designed to be independent, to capture different aspects of quality, e.g. content fidelity or readability. We provide a discussion of these findings and recommendations for future work in the field of evaluation.
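One common way to compare metrics "in the space of system rankings" is a rank correlation such as Kendall's tau. The abstract does not say which statistic the authors use, so this sketch is an assumption: it implements tau-a from scratch, and the system names and scores are invented.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall's tau-a between the rankings induced by two score dicts.

    Both dicts map system name -> score (higher is better) over the same
    set of systems. Tied pairs count as neither concordant nor discordant.
    """
    systems = list(scores_a)
    concordant = discordant = 0
    for s, t in combinations(systems, 2):
        agreement = (scores_a[s] - scores_a[t]) * (scores_b[s] - scores_b[t])
        if agreement > 0:
            concordant += 1
        elif agreement < 0:
            discordant += 1
    n_pairs = len(systems) * (len(systems) - 1) // 2
    return (concordant - discordant) / n_pairs


# Made-up scores for four systems under two automatic metrics and one
# human judgment; none of these numbers come from the paper.
metric_x = {"A": 0.90, "B": 0.80, "C": 0.70, "D": 0.60}
metric_y = {"A": 0.50, "B": 0.45, "C": 0.30, "D": 0.35}
human = {"A": 3.0, "B": 4.0, "C": 3.5, "D": 2.0}

print(kendall_tau(metric_x, metric_y))  # automatic vs automatic
print(kendall_tau(metric_x, human))     # automatic vs human
```

In this toy data the two automatic metrics agree on five of six system pairs (tau = 4/6), while `metric_x` agrees with the human ranking on only four (tau = 2/6). That is the kind of pattern the abstract reports, although these numbers only illustrate the computation.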


What's ahead in agriculture's journey toward artificial intelligence

#artificialintelligence

MILWAUKEE -- Agriculture is among the last major industries to become digitized. It doesn't come as a major surprise, seeing as how off-road, rural environments are more challenging than roadway systems or manufacturing floors. However, as the connectivity gap continues to close, there is tremendous opportunity to capture data that can ultimately lead to transformative technologies like artificial intelligence (AI). "To put it as simply as possible, AI allows computer systems to complete tasks that are normally performed by humans," said Mark Kuehn, OEM sales manager for North America at Trimble. Given that definition, AI could mean everything from cognitive tasks like data analytics and forecasting to physical tasks like spraying weeds and picking produce.


Autoregressive Energy Machines

Nash, Charlie, Durkan, Conor

arXiv.org Machine Learning

Neural density estimators are flexible families of parametric models which have seen widespread use in unsupervised machine learning in recent years. Maximum-likelihood training typically dictates that these models be constrained to specify an explicit density. However, this limitation can be overcome by instead using a neural network to specify an energy function, or unnormalized density, which can subsequently be normalized to obtain a valid distribution. The challenge with this approach lies in accurately estimating the normalizing constant of the high-dimensional energy function. We propose the Autoregressive Energy Machine, an energy-based model which simultaneously learns an unnormalized density and computes an importance-sampling estimate of the normalizing constant for each conditional in an autoregressive decomposition. The Autoregressive Energy Machine achieves state-of-the-art performance on a suite of density-estimation tasks.
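The core numerical idea, an importance-sampling estimate of the normalizing constant of a one-dimensional unnormalized density, can be sketched as follows. This is not the Autoregressive Energy Machine itself: the quadratic energy stands in for a neural network, a fixed Gaussian proposal is assumed, and the function names, `scale`, `n_samples`, and `seed` are illustrative.

```python
import math
import random


def energy(x):
    """Toy energy E(x) = x^2 / 2, so exp(-E) is an unnormalized standard normal."""
    return 0.5 * x * x


def proposal_pdf(x, scale):
    """Density of the zero-mean Gaussian proposal with standard deviation `scale`."""
    return math.exp(-0.5 * (x / scale) ** 2) / (scale * math.sqrt(2.0 * math.pi))


def estimate_normalizer(energy_fn, n_samples=20000, scale=2.0, seed=0):
    """Importance-sampling estimate of Z = integral of exp(-E(x)) dx.

    Z is approximated by the average of exp(-E(x_i)) / q(x_i) over samples
    x_i drawn from the proposal q.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, scale)
        total += math.exp(-energy_fn(x)) / proposal_pdf(x, scale)
    return total / n_samples


z_hat = estimate_normalizer(energy)
print(z_hat, math.sqrt(2.0 * math.pi))  # estimate vs exact value for this toy energy
```

For this toy energy the exact constant is sqrt(2π), which gives a reference to check the estimate against. In an autoregressive decomposition such an estimate would be computed for each one-dimensional conditional, which keeps every integral low-dimensional.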


The Ways Artificial Intelligence Will Change Construction

#artificialintelligence

Artificial intelligence (AI) in the construction industry has the potential to boost productivity, safety, and other aspects of business success, according to an article published by the Association of Equipment Manufacturers (AEM). An AI system can enable such services as predictive maintenance, which multiplies the value of the Internet of Things (IoT). "With AI, users can learn patterns that lead to failures and make predictions such as construction equipment failing if it is not serviced after a certain amount of time," says Maciej Kranz, IoT expert and VP of strategic innovation at Cisco. "The AI system might also recommend how to operate the equipment to maximize its useful life, offering trade-offs between performance and longevity." Machine learning makes the analytics systems "smarter" as time goes on and more data sets and patterns are available.