AITopics | tri

Collaborating Authors

tri

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Improving the Straight-Through Estimator with Zeroth-Order Information

Neural Information Processing SystemsJun-23-2026, 02:51:27 GMT

We study the problem of training neural networks with quantized parameters. Learning low-precision quantized parameters by enabling computation of gradients via the Straight-Through Estimator (STE) can be challenging. While the STE enables back-propagation, which is a first-order method, recent works have explored the use of zeroth-order (ZO) gradient descent for fine-tuning. We note that the STE provides high-quality biased gradients, and ZO gradients are unbiased but can be expensive. We thus propose First-Order-Guided Zeroth-Order Gradient Descent (FOGZO) that reduces STE bias while reducing computations relative to ZO methods. Empirically, we show FOGZO improves the tradeoff between quality and training time in Quantization-Aware Pre-Training. Specifically, versus STE at the same number of iterations, we show a 1-8% accuracy improvement for DeiTTiny/Small, 1-2% accuracy improvement on ResNet 18/50, and 1-22 perplexity point improvement for LLaMA models with up to 0.3 billion parameters. For the same loss, FOGZO yields a 796 reduction in computation versus n-SPSA for a 2-layer MLP on MNIST.

artificial intelligence, fogzo, machine learning, (19 more...)

Neural Information Processing Systems

Country:

Europe (0.46)
Asia (0.46)
North America (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Add feedback

Instance Dependent Testing of Samplers using Interval Conditioning

Bhattacharyya, Rishiraj, Chakraborty, Sourav, Pote, Yash, Sarkar, Uddalok, Sen, Sayantan

arXiv.org Artificial IntelligenceDec-9-2025

Sampling algorithms play a pivotal role in probabilistic AI. However, verifying if a sampler program indeed samples from the claimed distribution is a notoriously hard problem. Provably correct testers like Barbarik, Teq, Flash, CubeProbe for testing of different kinds of samplers were proposed only in the last few years. All these testers focus on the worst-case efficiency, and do not support verification of samplers over infinite domains, a case occurring frequently in Astronomy, Finance, Network Security, etc. In this work, we design the first tester of samplers with instance-dependent efficiency, allowing us to test samplers over natural numbers. Our tests are developed via a novel distance estimation algorithm between an unknown and a known probability distribution using an interval conditioning framework. The core technical contribution is a new connection with probability mass estimation of a continuous distribution. The practical gains are also substantial: our experiments establish up to 1000x speedup over state-of-the-art testers.

artificial intelligence, probability, sampler, (13 more...)

arXiv.org Artificial Intelligence

2512.06458

Country: North America > Canada > Ontario (0.28)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.34)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)

Add feedback

Temporal-adaptive Weight Quantization for Spiking Neural Networks

Zhang, Han, Meng, Qingyan, Wang, Jiaqi, Chen, Baiyu, Ma, Zhengyu, Fan, Xiaopeng

arXiv.org Artificial IntelligenceNov-25-2025

Weight quantization in spiking neural networks (SNNs) could further reduce energy consumption. However, quantizing weights without sacrificing accuracy remains challenging. In this study, inspired by astrocyte-mediated synaptic modulation in the biological nervous systems, we propose Temporal-adaptive Weight Quantization (TaWQ), which incorporates weight quantization with temporal dynamics to adaptively allocate ultra-low-bit weights along the temporal dimension. Extensive experiments on static (e.g., ImageNet) and neuromorphic (e.g., CIFAR10-DVS) datasets demonstrate that our TaWQ maintains high energy efficiency (4.12M, 0.63mJ) while incurring a negligible quantization loss of only 0.22% on ImageNet.

artificial intelligence, machine learning, quantization, (15 more...)

arXiv.org Artificial Intelligence

2511.17567

Country:

North America > United States (1.00)
Asia (1.00)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Improving the Straight-Through Estimator with Zeroth-Order Information

Yang, Ningfeng, Aamodt, Tor M.

arXiv.org Artificial IntelligenceOct-29-2025

We study the problem of training neural networks with quantized parameters. Learning low-precision quantized parameters by enabling computation of gradients via the Straight-Through Estimator (STE) can be challenging. While the STE enables back-propagation, which is a first-order method, recent works have explored the use of zeroth-order (ZO) gradient descent for fine-tuning. We note that the STE provides high-quality biased gradients, and ZO gradients are unbiased but can be expensive. We thus propose First-Order-Guided Zeroth-Order Gradient Descent (FOGZO) that reduces STE bias while reducing computations relative to ZO methods. Empirically, we show FOGZO improves the tradeoff between quality and training time in Quantization-Aware Pre-Training. Specifically, versus STE at the same number of iterations, we show a 1-8\% accuracy improvement for DeiT Tiny/Small, 1-2\% accuracy improvement on ResNet 18/50, and 1-22 perplexity point improvement for LLaMA models with up to 0.3 billion parameters. For the same loss, FOGZO yields a 796$\times$ reduction in computation versus n-SPSA for a 2-layer MLP on MNIST. Code is available at https://github.com/1733116199/fogzo.

artificial intelligence, fogzo, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2510.23926

Country:

Europe (0.45)
North America (0.28)
Asia (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Add feedback

Tri-Learn Graph Fusion Network for Attributed Graph Clustering

Li, Binxiong, Xiang, Xu, Li, Xue, Zhao, Binyu, Gao, Heyang, Zhao, Qinyu

arXiv.org Artificial IntelligenceJul-23-2025

In recent years, models based on Graph Convolutional Networks (GCN) have made significant strides in the field of graph data analysis. However, challenges such as over-smoothing and over-compression remain when handling large-scale and complex graph datasets, leading to a decline in clustering quality. Although the Graph Transformer architecture has mitigated some of these issues, its performance is still limited when processing heterogeneous graph data. To address these challenges, this study proposes a novel deep clustering framework that comprising GCN, Autoencoder (AE), and Graph Transformer, termed the Tri-Learn Graph Fusion Network (Tri-GFN). This framework enhances the differentiation and consistency of global and local information through a unique tri-learning mechanism and feature fusion enhancement strategy. The framework integrates GCN, AE, and Graph Transformer modules. These components are meticulously fused by a triple-channel enhancement module, which maximizes the use of both node attributes and topological structures, ensuring robust clustering representation. The tri-learning mechanism allows mutual learning among these modules, while the feature fusion strategy enables the model to capture complex relationships, yielding highly discriminative representations for graph clustering. It surpasses many state-of-the-art methods, achieving an accuracy improvement of approximately 0.87% on the ACM dataset, 14.14 % on the Reuters dataset, and 7.58 % on the USPS dataset. Due to its outstanding performance on the Reuters dataset, Tri-GFN can be applied to automatic news classification, topic retrieval, and related fields.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2507.1362

Country:

North America > United States (1.00)
Asia > China > Sichuan Province (0.28)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.93)
Research Report > Promising Solution (0.87)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging

Islam, Md. Ashraful, Ali, Mohammed Eunus, Parvez, Md Rizwan

arXiv.org Artificial IntelligenceFeb-8-2025

Large Language Models (LLMs) have made significant strides in code generation and problem solving. Current approaches employ external tool-based iterative debuggers that use compiler or other tool-based runtime feedback to refine coarse programs generated by various methods. However, the effectiveness of these approaches heavily relies on the quality of the initial code generation, which remains an open challenge. In this paper, we introduce CodeSim, a novel multi-agent code generation framework that comprehensively addresses the stages of program synthesis-planning, coding, and debugging-through a human-like perception approach. As human verifies their understanding of any algorithms through visual simulation, CodeSim uniquely features a method of plan verification and internal debugging through the step-by-step simulation of input/output. Extensive experiments across seven challenging competitive problem-solving and program synthesis benchmarks demonstrate CodeSim's remarkable code generation capabilities. Our framework achieves new state-of-the-art (pass@1) results-(HumanEval 95.1%, MBPP 90.7%, APPS 22%, and CodeContests 29.1%). Furthermore, our method shows potential for even greater enhancement when cascaded with external debuggers. To facilitate further research and development in this area, we have open-sourced our framework in this link (https://kagnlp.github.io/codesim.github.io/).

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2502.05664

Country:

Asia > Thailand > Bangkok > Bangkok (0.04)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
Asia > Singapore (0.04)
(8 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Automatic Programming (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Add feedback

From Code to Correctness: Closing the Last Mile of Code Generation with Hierarchical Debugging

Shi, Yuling, Wang, Songsong, Wan, Chengcheng, Gu, Xiaodong

arXiv.org Artificial IntelligenceOct-5-2024

While large language models have made significant strides in code generation, the pass rate of the generated code is bottlenecked on subtle errors, often requiring human intervention to pass tests, especially for complex problems. Existing LLM-based debugging systems treat generated programs as monolithic units, failing to address bugs at multiple levels of granularity, from low-level syntax errors to high-level algorithmic flaws. In this paper, we introduce Multi-Granularity Debugger (MGDebugger), a hierarchical code debugger by isolating, identifying, and resolving bugs at various levels of granularity. MGDebugger decomposes problematic code into a hierarchical tree structure of subfunctions, with each level representing a particular granularity of error. During debugging, it analyzes each subfunction and iteratively resolves bugs in a bottom-up manner. To effectively test each subfunction, we propose an LLM-simulated Python executor, which traces code execution and tracks important variable states to pinpoint errors accurately. Extensive experiments demonstrate that MGDebugger outperforms existing debugging systems, achieving an 18.9% improvement in accuracy over seed generations in HumanEval and a 97.6% repair success rate in HumanEvalFix. Furthermore, MGDebugger effectively fixes bugs across different categories and difficulty levels, demonstrating its robustness and effectiveness.

mgdebugger, subfunction, test case, (17 more...)

arXiv.org Artificial Intelligence

2410.01215

Country:

North America > Canada > Ontario > Toronto (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(3 more...)

Genre:

Workflow (0.93)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Add feedback

An action language-based formalisation of an abstract argumentation framework

Munro, Yann, Sarmiento, Camilo, Bloch, Isabelle, Bourgne, Gauvain, Pelachaud, Catherine, Lesot, Marie-Jeanne

arXiv.org Artificial IntelligenceSep-29-2024

An abstract argumentation framework is a commonly used formalism to provide a static representation of a dialogue. However, the order of enunciation of the arguments in an argumentative dialogue is very important and can affect the outcome of this dialogue. In this paper, we propose a new framework for modelling abstract argumentation graphs, a model that incorporates the order of enunciation of arguments. By taking this order into account, we have the means to deduce a unique outcome for each dialogue, called an extension. We also establish several properties, such as termination and correctness, and discuss two notions of completeness. In particular, we propose a modification of the previous transformation based on a "last enunciated last updated" strategy, which verifies the second form of completeness.

argument, dialogue, extension, (16 more...)

arXiv.org Artificial Intelligence

2409.19625

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Explanation & Argumentation (1.00)

Add feedback

Integrate the Essence and Eliminate the Dross: Fine-Grained Self-Consistency for Free-Form Language Generation

Wang, Xinglin, Li, Yiwei, Feng, Shaoxiong, Yuan, Peiwen, Pan, Boyuan, Wang, Heda, Hu, Yao, Li, Kan

arXiv.org Artificial IntelligenceJul-2-2024

Self-consistency (SC), leveraging multiple samples from LLMs, shows significant gains on various reasoning tasks but struggles with free-form generation due to the difficulty of aggregating answers. Its variants, UCS and USC, rely on sample selection or voting mechanisms to improve output quality. These methods, however, face limitations due to their inability to fully utilize the nuanced consensus knowledge present within multiple candidate samples, often resulting in suboptimal outputs. We propose Fine-Grained Self-Consistency (FSC) to addresses these limitations by extracting and integrating segment-level commonalities from candidate samples, enhancing the performance of LLMs both in open-ended and reasoning tasks. Based on this, we present two additional strategies: candidate filtering, which enhances overall quality by identifying highly similar candidate sets, and merging, which reduces input token requirements by combining similar samples. The effectiveness of FSC is demonstrated through extensive experiments on various tasks, including summarization, code generation, and mathematical reasoning, using GPT-3.5-turbo and GPT-4. The results indicate significant improvements over baseline methods, showcasing the potential of FSC to optimize output quality by effectively synthesizing fine-grained consensus knowledge from multiple samples.

fsc, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2407.02056

Country:

Europe > United Kingdom > England > Greater London > London > Wimbledon (0.05)
North America > United States > New York (0.05)
Europe > United Kingdom > Scotland (0.04)
(9 more...)

Genre: Research Report > New Finding (0.68)

Industry: Leisure & Entertainment > Sports > Tennis (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)

Add feedback

Toyota is using AI to design better cars faster

FOX NewsJun-21-2023, 12:19:36 GMT

The 2023 Toyota Prius Prime is all-new and more stylish than any version of the model before it. Toyota is brining artificial intelligence deeper into the car design process. The automaker has developed a new algorithm that can help develop new vehicle designs quicker than traditional processes. The system allows engineers and designers to upload sketches, then use text-based prompts to ask the system to optimize them within various engineering constraints. "Generative AI tools are often used as inspiration for designers, but cannot handle the complex engineering and safety considerations that go into actual car design," explained Avinash Balachandran, director of the Toyota Research Institute's (TRI) Human Interactive Driving division.

design better car, design process, toyota, (5 more...)

FOX News

Country: North America > United States > California (0.07)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks > Manufacturer (1.00)

Technology: Information Technology > Artificial Intelligence > Applied AI (0.40)

Add feedback