AITopics | iteration count

Collaborating Authors

iteration count

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

96f2d6069db8ad895c34e2285d25c0ed-Supplemental.pdf

Neural Information Processing SystemsFeb-9-2026, 10:45:07 GMT

Smooth convex optimization problems over polytopes are an important class of problems that appear in many settings, such as low-rank matrix completion [1],structured supervised learning [2,3],electrical flowsovergraphs [4],video co-localization in computer vision [5], traffic assignment problems [6], and submodular function minimization [7].

artificial intelligence, optimization problem, xi 1, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > New York (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Germany (0.04)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

Add feedback

Deceptron: Learned Local Inverses for Fast and Stable Physics Inversion

Kachhadiya, Aaditya L.

arXiv.org Artificial IntelligenceNov-27-2025

Inverse problems in the physical sciences are often ill-conditioned in input space, making progress step-size sensitive. We propose the Deceptron, a lightweight bidirectional module that learns a local inverse of a differentiable forward surrogate. Training combines a supervised fit, forward-reverse consistency, a lightweight spectral penalty, a soft bias tie, and a Jacobian Composition Penalty (JCP) that encourages $J_g(f(x))\,J_f(x)\!\approx\!I$ via JVP/VJP probes. At solve time, D-IPG (Deceptron Inverse-Preconditioned Gradient) takes a descent step in output space, pulls it back through $g$, and projects under the same backtracking and stopping rules as baselines. On Heat-1D initial-condition recovery and a Damped Oscillator inverse problem, D-IPG reaches a fixed normalized tolerance with $\sim$20$\times$ fewer iterations on Heat and $\sim$2-3$\times$ fewer on Oscillator than projected gradient, competitive in iterations and cost with Gauss-Newton. Diagnostics show JCP reduces a measured composition error and tracks iteration gains. We also preview a single-scale 2D instantiation, DeceptronNet (v0), that learns few-step corrections under a strict fairness protocol and exhibits notably fast convergence.

artificial intelligence, iteration, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2511.21076

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

AbbIE: Autoregressive Block-Based Iterative Encoder for Efficient Sequence Modeling

Aleksandrov, Preslav, Kurmanji, Meghdad, Redondo, Fernando Garcia, O'Shea, David, Shen, William, Iacob, Alex, Sani, Lorenzo, Qiu, Xinchi, Cancedda, Nicola, Lane, Nicholas D.

arXiv.org Artificial IntelligenceAug-8-2025

We introduce the Autoregressive Block-Based Iterative Encoder (AbbIE), a novel recursive generalization of the encoder-only Transformer architecture, which achieves better perplexity than a standard Transformer and allows for the dynamic scaling of compute resources at test time. This simple, recursive approach is a complement to scaling large language model (LLM) performance through parameter and token counts. AbbIE performs its iterations in latent space, but unlike latent reasoning models, does not require a specialized dataset or training protocol. We show that AbbIE upward generalizes (ability to generalize to arbitrary iteration lengths) at test time by only using 2 iterations during train time, far outperforming alternative iterative methods. AbbIE's ability to scale its computational expenditure based on the complexity of the task gives it an up to \textbf{12\%} improvement in zero-shot in-context learning tasks versus other iterative and standard methods and up to 5\% improvement in language perplexity. The results from this study open a new avenue to Transformer performance scaling. We perform all of our evaluations on model sizes up to 350M parameters.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2507.08567

Country:

North America > United States (0.68)
Europe (0.67)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Reviews: Large-scale optimal transport map estimation using projection pursuit

Neural Information Processing SystemsJan-23-2025, 12:51:40 GMT

After Rebuttal Thank you for the response and the clarifications. We still have some concerns about the complexity of the algorithm and its dependence on the iteration count -- in particular in cases where this may scale with the dimension d based on the specified termination criteria. This being said, it would be great to comment on this in the numerical results and include more timing studies to confirm this dependence. We also strongly suggest that the authors include the extensions listed in the improvement in the final revision. Before Rebuttal The authors present a novel algorithm with both theoretical analysis and empirical results. We have a few comments and suggestions for the work: In the introduction, we recommend that the authors also note alternative versions of finding OTM that are not based on solving a linear program (including non-discrete versions of OT based on solving ODEs or finding parametrized maps).

large-scale optimal transport map estimation, lastly, projection pursuit, (10 more...)

Neural Information Processing Systems

Genre: Summary/Review (0.73)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.36)

Add feedback

A Hybrid Deep Learning and Model-Checking Framework for Accurate Brain Tumor Detection and Validation

Fatimi, Lahcen El, Elfatimi, Elhoucine, Bouchaneb, Hanifa

arXiv.org Artificial IntelligenceDec-31-2024

Model checking is an automatic technique for verifying the correctness properties of safety-critical reactive systems. This method has been successfully applied to find subtle errors in complex systems. Model checking techniques have a wide range of application domains, among which large-scale distributed systems [1-3], signal [4], and medical images analysis [5-8]. The research related to the last topic is still ongoing looking for the perfect (precise, complete, simple) approach for analyzing medical images. The use of model checking is relatively recent, in particular regarding the verification of the analysis of medical images. In this domain, model checking in medical images has shown to be a promising application that can significantly facilitate the work of professionals. What motivates us in this study, considering that model checking is increasingly used in testing to check whether a system model satisfies a property, is to take model checking in its usual role to take on more advanced roles in medical image analysis by applying model-checking logic to medical images and detection of tumors in addition to validation of properties through tests or case studies.

artificial intelligence, logic & formal reasoning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2501.01991

Country: North America > United States > California (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A fast neural hybrid Newton solver adapted to implicit methods for nonlinear dynamics

Jin, Tianyu, Maierhofer, Georg, Schratz, Katharina, Xiang, Yang

arXiv.org Artificial IntelligenceJul-4-2024

The use of implicit time-stepping schemes for the numerical approximation of solutions to stiff nonlinear time-evolution equations brings well-known advantages including, typically, better stability behaviour and corresponding support of larger time steps, and better structure preservation properties. However, this comes at the price of having to solve a nonlinear equation at every time step of the numerical scheme. In this work, we propose a novel operator learning based hybrid Newton's method to accelerate this solution of the nonlinear time step system for stiff time-evolution nonlinear equations. We propose a targeted learning strategy which facilitates robust unsupervised learning in an offline phase and provides a highly efficient initialisation for the Newton iteration leading to consistent acceleration of Newton's method. A quantifiable rate of improvement in Newton's method achieved by improved initialisation is provided and we analyse the upper bound of the generalisation error of our unsupervised learning strategy. These theoretical results are supported by extensive numerical results, demonstrating the efficiency of our proposed neural hybrid solver both in one- and two-dimensional cases.

hybrid solver, iteration, solver, (17 more...)

arXiv.org Artificial Intelligence

2407.03945

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Asia > China > Hong Kong (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)
(3 more...)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Generative modeling of Sparse Approximate Inverse Preconditioners

Li, Mou, Wang, He, Jimack, Peter K.

arXiv.org Artificial IntelligenceMay-17-2024

We present a new deep learning paradigm for the generation of sparse approximate inverse (SPAI) preconditioners for matrix systems arising from the mesh-based discretization of elliptic differential operators. Our approach is based upon the observation that matrices generated in this manner are not arbitrary, but inherit properties from differential operators that they discretize. Consequently, we seek to represent a learnable distribution of high-performance preconditioners from a low-dimensional subspace through a carefully-designed autoencoder, which is able to generate SPAI preconditioners for these systems. The concept has been implemented on a variety of finite element discretizations of second- and fourth-order elliptic partial differential equations with highly promising results.

condition number, matrix, preconditioner, (13 more...)

arXiv.org Artificial Intelligence

2405.11007

Country:

Europe > United Kingdom (0.04)
North America > United States (0.04)
Europe > Netherlands > South Holland > Dordrecht (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Add feedback

Non-Intrusive Load Monitoring with Missing Data Imputation Based on Tensor Decomposition

Shi, DengYu

arXiv.org Artificial IntelligenceMar-9-2024

With the widespread adoption of Non-Intrusive Load Monitoring (NILM) in building energy management, ensuring the high quality of NILM data has become imperative. However, practical applications of NILM face challenges associated with data loss, significantly impacting accuracy and reliability in energy management. This paper addresses the issue of NILM data loss by introducing an innovative tensor completion(TC) model- Proportional-Integral-Derivative (PID)-incorporated Non-negative Latent Factorization of Tensors (PNLFT) with twofold ideas: 1) To tackle the issue of slow convergence in Latent Factorization of Tensors (LFT) using Stochastic Gradient Descent (SGD), a Proportional-Integral-Derivative controller is introduced during the learning process. The PID controller utilizes historical and current information to control learning residuals. 2) Considering the characteristics of NILM data, non-negative update rules are proposed in the model's learning scheme. Experimental results on three datasets demonstrate that, compared to state-of-the-art models, the proposed model exhibits noteworthy enhancements in both convergence speed and accuracy.

factorization, iteration count, tensor, (15 more...)

arXiv.org Artificial Intelligence

2403.07012

Country:

Asia > China > Chongqing Province > Chongqing (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
Europe > United Kingdom (0.04)
(2 more...)

Genre: Research Report > Promising Solution (0.48)

Industry: Energy > Power Industry (0.68)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Add feedback

A Deep Conjugate Direction Method for Iteratively Solving Linear Systems

Kaneda, Ayano, Akar, Osman, Chen, Jingyu, Kala, Victoria, Hyde, David, Teran, Joseph

arXiv.org Artificial IntelligenceOct-1-2022

We present a novel deep learning approach to approximate the solution of large, sparse, symmetric, positive-definite linear systems of equations. These systems arise from many problems in applied science, e.g., in numerical methods for partial differential equations. Algorithms for approximating the solution to these systems are often the bottleneck in problems that require their solution, particularly for modern applications that require many millions of unknowns. Indeed, numerical linear algebra techniques have been investigated for many decades to alleviate this computational burden. Recently, data-driven techniques have also shown promise for these problems. Motivated by the conjugate gradients algorithm that iteratively selects search directions for minimizing the matrix norm of the approximation error, we design an approach that utilizes a deep neural network to accelerate convergence via data-driven improvement of the search directions. Our method leverages a carefully chosen convolutional network to approximate the action of the inverse of the linear operator up to an arbitrary constant. We train the network using unsupervised learning with a loss function equal to the $L^2$ difference between an input and the system matrix times the network evaluation, where the unspecified constant in the approximate inverse is accounted for. We demonstrate the efficacy of our approach on spatially discretized Poisson equations with millions of degrees of freedom arising in computational fluid dynamics applications. Unlike state-of-the-art learning approaches, our algorithm is capable of reducing the linear system residual to a given tolerance in a small number of iterations, independent of the problem size. Moreover, our method generalizes effectively to various systems beyond those encountered during training.

artificial intelligence, machine learning, matrix, (20 more...)

arXiv.org Artificial Intelligence

2205.10763

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > California > Yolo County > Davis (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Tennessee > Davidson County > Nashville (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Manifold Free Riemannian Optimization

Shustin, Boris, Avron, Haim, Sober, Barak

arXiv.org Machine LearningSep-7-2022

Riemannian optimization is a principled framework for solving optimization problems where the desired optimum is constrained to a smooth manifold $\mathcal{M}$. Algorithms designed in this framework usually require some geometrical description of the manifold, which typically includes tangent spaces, retractions, and gradients of the cost function. However, in many cases, only a subset (or none at all) of these elements can be accessed due to lack of information or intractability. In this paper, we propose a novel approach that can perform approximate Riemannian optimization in such cases, where the constraining manifold is a submanifold of $\R^{D}$. At the bare minimum, our method requires only a noiseless sample set of the cost function $(\x_{i}, y_{i})\in {\mathcal{M}} \times \mathbb{R}$ and the intrinsic dimension of the manifold $\mathcal{M}$. Using the samples, and utilizing the Manifold-MLS framework (Sober and Levin 2020), we construct approximations of the missing components entertaining provable guarantees and analyze their computational costs. In case some of the components are given analytically (e.g., if the cost function and its gradient are given explicitly, or if the tangent spaces can be computed), the algorithm can be easily adapted to use the accurate expressions instead of the approximations. We analyze the global convergence of Riemannian gradient-based methods using our approach, and we demonstrate empirically the strength of this method, together with a conjugate-gradients type method based upon similar principles.

artificial intelligence, machine learning, manifold, (18 more...)

arXiv.org Machine Learning

2209.03269

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Add feedback