AITopics

2501.04572

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Industry: Education (0.97)

Technology:

Information Technology > Control Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.37)

Gharibi, Mirmojtaba, Clarke, John-Paul

Cluster & Disperse: a general air conflict resolution heuristic using unsupervised learning

arXiv.org Artificial IntelligenceJan-8-2025

We provide a general and malleable heuristic for the air conflict resolution problem. This heuristic is based on a new neighborhood structure for searching the solution space of trajectories and flight-levels. Using unsupervised learning, the core idea of our heuristic is to cluster the conflict points and disperse them in various flight levels. Our first algorithm is called Cluster & Disperse and in each iteration it assigns the most problematic flights in each cluster to another flight-level. In effect, we shuffle them between the flight-levels until we achieve a well-balanced configuration. The Cluster & Disperse algorithm then uses any horizontal plane conflict resolution algorithm as a subroutine to solve these well-balanced instances. Nevertheless, we develop a novel algorithm for the horizontal plane based on a similar idea. That is we cluster and disperse the conflict points spatially in the same flight level using the gradient descent and a social force. We use a novel maneuver making flights travel on an arc instead of a straight path which is based on the aviation routine of the Radius to Fix legs. Our algorithms can handle a high density of flights within a reasonable computation time. We put their performance in context with some notable algorithms from the literature. Being a general framework, a particular strength of the Cluster & Disperse is its malleability in allowing various constraints regarding the aircraft or the environment to be integrated with ease. This is in contrast to the models for instance based on mixed integer programming.

artificial intelligence, flight, machine learning, (15 more...)

2501.04281

Country:

Europe (0.04)
North America > United States > Texas > Travis County > Austin (0.04)

Genre:

Research Report (0.50)
Overview (0.46)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Air (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

arXiv.org Machine LearningJan-8-2025

A Survey on Algorithmic Developments in Optimal Transport Problem with Applications

Moradi, Sina

Optimal Transport (OT) has established itself as a robust framework for quantifying differences between distributions, with applications that span fields such as machine learning, data science, and computer vision. This paper offers a detailed examination of the OT problem, beginning with its theoretical foundations, including the classical formulations of Monge and Kantorovich and their extensions to modern computational techniques. It explores cutting-edge algorithms, including Sinkhorn iterations, primal-dual strategies, and reduction-based approaches, emphasizing their efficiency and scalability in addressing high-dimensional problems. The paper also highlights emerging trends, such as integrating OT into machine learning frameworks, the development of novel problem variants, and ongoing theoretical advancements. Applications of OT are presented across a range of domains, with particular attention to its innovative application in time series data analysis via Optimal Transport Warping (OTW), a robust alternative to methods like Dynamic Time Warping. Despite the significant progress made, challenges related to scalability, robustness, and ethical considerations remain, necessitating further research. The paper underscores OT's potential to bridge theoretical depth and practical utility, fostering impactful advancements across diverse disciplines.

algorithm, artificial intelligence, machine learning, (17 more...)

2501.06247

Country:

North America > United States (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > Iran (0.04)

Genre:

Research Report (1.00)
Overview > Innovation (0.48)

Industry:

Transportation (0.94)
Health & Medicine (0.93)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.46)

Jignasu, Anushrut, Herron, Ethan, Jiang, Zhanhong, Sarkar, Soumik, Hegde, Chinmay, Ganapathysubramanian, Baskar, Balu, Aditya, Krishnamurthy, Adarsh

STITCH: Surface reconstrucTion using Implicit neural representations with Topology Constraints and persistent Homology

arXiv.org Artificial IntelligenceJan-8-2025

We present STITCH, a novel approach for neural implicit surface reconstruction of a sparse and irregularly spaced point cloud while enforcing topological constraints (such as having a single connected component). We develop a new differentiable framework based on persistent homology to formulate topological loss terms that enforce the prior of a single 2-manifold object. Our method demonstrates excellent performance in preserving the topology of complex 3D geometries, evident through both visual and empirical comparisons. We supplement this with a theoretical analysis, and provably show that optimizing the loss with stochastic (sub)gradient descent leads to convergence and enables reconstructing shapes with a single connected component. Our approach showcases the integration of differentiable topological data analysis tools for implicit surface reconstruction.

point cloud, reconstruction, stitch, (12 more...)

2412.18696

Country:

North America > United States > New York (0.04)
North America > United States > Iowa (0.04)
Europe > France (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Minh, Tâm Le, Arbel, Julyan, Möllenhoff, Thomas, Khan, Mohammad Emtiyaz, Forbes, Florence

Natural Variational Annealing for Multimodal Optimization

arXiv.org Machine LearningJan-8-2025

We introduce a new multimodal optimization approach called Natural Variational Annealing (NVA) that combines the strengths of three foundational concepts to simultaneously search for multiple global and local modes of black-box nonconvex objectives. First, it implements a simultaneous search by using variational posteriors, such as, mixtures of Gaussians. Second, it applies annealing to gradually trade off exploration for exploitation. Finally, it learns the variational search distribution using natural-gradient learning where updates resemble well-known and easy-to-implement algorithms. The three concepts come together in NVA giving rise to new algorithms and also allowing us to incorporate "fitness shaping", a core concept from evolutionary algorithms. We assess the quality of search on simulations and compare them to methods using gradient descent and evolution strategies. We also provide an application to a real-world inverse problem in planetary science.

algorithm, gaussian mixture, gradient, (16 more...)

2501.04667

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Oceania > Australia (0.04)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)

Genre: Research Report > Promising Solution (0.45)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(2 more...)

Shoshani, Itamar, Shamir, Ohad

Hardness of Learning Fixed Parities with Neural Networks

arXiv.org Machine LearningJan-8-2025

Learning parity functions is a canonical problem in learning theory, which although computationally tractable, is not amenable to standard learning algorithms such as gradient-based methods. This hardness is usually explained via statistical query lower bounds [Kearns, 1998]. However, these bounds only imply that for any given algorithm, there is some worst-case parity function that will be hard to learn. Thus, they do not explain why fixed parities - say, the full parity function over all coordinates - are difficult to learn in practice, at least with standard predictors and gradient-based methods [Abbe and Boix-Adsera, 2022]. In this paper, we address this open problem, by showing that for any fixed parity of some minimal size, using it as a target function to train one-hidden-layer ReLU networks with perturbed gradient descent will fail to produce anything meaningful. To establish this, we prove a new result about the decay of the Fourier coefficients of linear threshold (or weighted majority) functions, which may be of independent interest.

algorithm, fourier coefficient, parity function, (13 more...)

2501.00817

Country:

Asia > Middle East > Jordan (0.04)
South America > Brazil > São Paulo (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.89)

Bao, Ziyuan, Kanjo, Eiman, Banerjee, Soumya, Rashid, Hasib-Al, Mohsenin, Tinoosh

Decentralised Resource Sharing in TinyML: Wireless Bilayer Gossip Parallel SGD for Collaborative Learning

arXiv.org Artificial IntelligenceJan-8-2025

With the growing computational capabilities of microcontroller units (MCUs), edge devices can now support machine learning models. However, deploying decentralised federated learning (DFL) on such devices presents key challenges, including intermittent connectivity, limited communication range, and dynamic network topologies. This paper proposes a novel framework, bilayer Gossip Decentralised Parallel Stochastic Gradient Descent (GD PSGD), designed to address these issues in resource-constrained environments. The framework incorporates a hierarchical communication structure using Distributed Kmeans (DKmeans) clustering for geographic grouping and a gossip protocol for efficient model aggregation across two layers: intra-cluster and inter-cluster. We evaluate the framework's performance against the Centralised Federated Learning (CFL) baseline using the MCUNet model on the CIFAR-10 dataset under IID and Non-IID conditions. Results demonstrate that the proposed method achieves comparable accuracy to CFL on IID datasets, requiring only 1.8 additional rounds for convergence. On Non-IID datasets, the accuracy loss remains under 8\% for moderate data imbalance. These findings highlight the framework's potential to support scalable and privacy-preserving learning on edge devices with minimal performance trade-offs.

artificial intelligence, convergence, machine learning, (16 more...)

2501.04817

Country: North America > United States > Maryland (0.67)

Genre: Research Report > New Finding (0.87)

Industry: Information Technology (0.67)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.56)

arXiv.org Artificial IntelligenceJan-7-2025

A precise asymptotic analysis of learning diffusion models: theory and insights

Cui, Hugo, Pehlevan, Cengiz, Lu, Yue M.

In this manuscript, we consider the problem of learning a flow or diffusion-based generative model parametrized by a two-layer auto-encoder, trained with online stochastic gradient descent, on a high-dimensional target density with an underlying low-dimensional manifold structure. We derive a tight asymptotic characterization of low-dimensional projections of the distribution of samples generated by the learned model, ascertaining in particular its dependence on the number of training samples. Building on this analysis, we discuss how mode collapse can arise, and lead to model collapse when the generative model is re-trained on generated synthetic data.

artificial intelligence, arxiv preprint arxiv, machine learning, (15 more...)

2501.03937

Country: Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Boon, Marcus N., Cassola, Lorenzo, Euler, Hans-Christian Ruiz, Chen, Tao, van de Ven, Bram, Ibarra, Unai Alegre, Bobbert, Peter A., van der Wiel, Wilfred G.

Gradient descent in materia through homodyne gradient extraction

arXiv.org Artificial IntelligenceJan-7-2025

Deep learning, a multi-layered neural network approach inspired by the brain, has revolutionized machine learning. One of its key enablers has been backpropagation, an algorithm that computes the gradient of a loss function with respect to the weights and biases in the neural network model, in combination with its use in gradient descent. However, the implementation of deep learning in digital computers is intrinsically energy hungry, with energy consumption becoming prohibitively high for many applications. This has stimulated the development of specialized hardware, ranging from neuromorphic CMOS integrated circuits and integrated photonic tensor cores to unconventional, material-based computing system. The learning process in these material systems, realized, e.g., by artificial evolution, equilibrium propagation or surrogate modelling, is a complicated and time-consuming process. Here, we demonstrate a simple yet efficient and accurate gradient extraction method, based on the principle of homodyne detection, for performing gradient descent on a loss function directly in a physical system without the need of an analytical description. By perturbing the parameters that need to be optimized using sinusoidal waveforms with distinct frequencies, we effectively obtain the gradient information in a highly robust and scalable manner. We illustrate the method in dopant network processing units, but argue that it is applicable in a wide range of physical systems. Homodyne gradient extraction can in principle be fully implemented in materia, facilitating the development of autonomously learning material systems.

derivative, frequency, perturbation, (17 more...)

2105.11233

Country:

North America > United States (0.28)
Europe > Netherlands > North Brabant > Eindhoven (0.04)
Europe > Germany > Berlin (0.04)
Europe > Germany > North Rhine-Westphalia > Münster Region > Münster (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Energy (0.66)
Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.86)

Bravo, Mario, Flores-Mella, Juan P., Guzmán, Cristóbal

Mixing Times and Privacy Analysis for the Projected Langevin Algorithm under a Modulus of Continuity

arXiv.org Machine LearningJan-7-2025

We study the mixing time of the projected Langevin algorithm (LA) and the privacy curve of noisy Stochastic Gradient Descent (SGD), beyond nonexpansive iterations. Specifically, we derive new mixing time bounds for the projected LA which are, in some important cases, dimension-free and poly-logarithmic on the accuracy, closely matching the existing results in the smooth convex case. Additionally, we establish new upper bounds for the privacy curve of the subsampled noisy SGD algorithm. These bounds show a crucial dependency on the regularity of gradients, and are useful for a wide range of convex losses beyond the smooth case. Our analysis relies on a suitable extension of the Privacy Amplification by Iteration (PABI) framework (Feldman et al., 2018; Altschuler and Talwar, 2022, 2023) to noisy iterations whose gradient map is not necessarily nonexpansive. This extension is achieved by designing an optimization problem which accounts for the best possible R\'enyi divergence bound obtained by an application of PABI, where the tractability of the problem is crucially related to the modulus of continuity of the associated gradient mapping. We show that, in several interesting cases -- including the nonsmooth convex, weakly smooth and (strongly) dissipative -- such optimization problem can be solved exactly and explicitly. This yields the tightest possible PABI-based bounds, where our results are either new or substantially sharper than those in previous works.

continuity, divergence, iteration, (15 more...)

2501.04134

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.69)