AITopics

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Neural Information Processing Systems, Oct-10-2025, 12:18:27 GMT

How JEPA Avoids Noisy Features: The Implicit Bias of Deep Linear Self Distillation Networks

Two competing paradigms exist for self-supervised learning of data representations.

jep, objective, representation, (16 more...)

Farghdani, Sahand, Abdelrahman, Omar, Chhabra, Robin

Fast and Modular Whole-Body Lagrangian Dynamics of Legged Robots with Changing Morphology

arXiv.org Artificial Intelligence, Apr-24-2025

Fast and modular modeling of multi-legged robots (MLRs) is essential for resilient control, particularly under significant morphological changes caused by mechanical damage. Conventional fixed-structure models, often developed with simplifying assumptions for nominal gaits, lack the flexibility to adapt to such scenarios. To address this, we propose a fast modular whole-body modeling framework using Boltzmann-Hamel equations and screw theory, in which each leg's dynamics is modeled independently and assembled based on the current robot morphology. This singularity-free, closed-form formulation enables efficient design of model-based controllers and damage identification algorithms. Its modularity allows autonomous adaptation to various damage configurations without manual re-derivation or retraining of neural networks. We validate the proposed framework using a custom simulation engine that integrates contact dynamics, a gait generator, and local leg control. Comparative simulations against hardware tests on a hexapod robot with multiple leg damage confirm the model's accuracy and adaptability. Additionally, runtime analyses reveal that the proposed model is approximately three times faster than real-time, making it suitable for real-time applications in damage identification and recovery.

artificial intelligence, equation, robot, (17 more...)

2504.16383

Country: North America > Canada > Ontario (0.28)

Genre: Research Report (0.63)

Technology: Information Technology > Artificial Intelligence > Robots > Locomotion (1.00)

arXiv.org Artificial Intelligence, Oct-2-2024

Quantum-data-driven dynamical transition in quantum learning

Zhang, Bingzhi, Liu, Junyu, Jiang, Liang, Zhuang, Quntao

Quantum circuits are an essential ingredient of quantum information processing. Parameterized quantum circuits optimized under a specific cost function, known as quantum neural networks (QNNs), provide a paradigm for achieving quantum advantage in the near term. Understanding QNN training dynamics is crucial for optimizing their performance. In terms of supervised learning tasks such as classification and regression for large datasets, the role of quantum data in QNN training dynamics remains unclear. We reveal a quantum-data-driven dynamical transition, where the target value and data determine the polynomial or exponential convergence of the training. We analytically derive the complete classification of fixed points from the dynamical equation and reveal a comprehensive 'phase diagram' featuring seven distinct dynamics. These dynamics originate from a bifurcation transition with multiple codimensions induced by training data, extending the transcritical bifurcation in simple optimization tasks. Furthermore, perturbative analyses identify an exponential convergence class and a polynomial convergence class among the seven dynamics. We provide a non-perturbative theory to explain the transition via generalized restricted Haar ensemble. The analytical results are confirmed with numerical simulations of QNN training and experimental verification on IBM quantum devices. As the QNN training dynamics is determined by the choice of the target value, our findings provide guidance on constructing the cost function to optimize the speed of convergence.

convergence, convergence class, ensemble, (14 more...)

2410.01955

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > Ukraine > Kyiv Oblast > Kyiv (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology (0.48)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.34)

arXiv.org Artificial Intelligence, Jul-3-2024

How JEPA Avoids Noisy Features: The Implicit Bias of Deep Linear Self Distillation Networks

Littwin, Etai, Saremi, Omid, Advani, Madhu, Thilak, Vimal, Nakkiran, Preetum, Huang, Chen, Susskind, Joshua

Two competing paradigms exist for self-supervised learning of data representations. Joint Embedding Predictive Architecture (JEPA) is a class of architectures in which semantically similar inputs are encoded into representations that are predictive of each other. A recent successful approach that falls under the JEPA framework is self-distillation, where an online encoder is trained to predict the output of the target encoder, sometimes using a lightweight predictor network. This is contrasted with the Masked AutoEncoder (MAE) paradigm, where an encoder and decoder are trained to reconstruct missing parts of the input in the data space rather than its latent representation. A common motivation for using the JEPA approach over MAE is that the JEPA objective prioritizes abstract features over fine-grained pixel information (which can be unpredictable and uninformative). In this work, we seek to understand the mechanism behind this empirical observation by analyzing the training dynamics of deep linear models. We uncover a surprising mechanism: in a simplified linear setting where both approaches learn similar representations, JEPAs are biased to learn high-influence features, i.e., features characterized by having high regression coefficients. Our results point to a distinct implicit bias of predicting in latent space that may shed light on its success in practice.
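As a reading aid, the contrast between the two paradigms can be written down for a linear model. The notation here is an assumption for illustration and is not taken from the listing: x is the input view, y the view to be predicted, W the (possibly deep, W = W_L ⋯ W_1) linear encoder, V a linear decoder (MAE) or predictor (JEPA), and sg(·) a stop-gradient on the target branch. The abstract does not state the exact losses, so this is a sketch of the general idea only.

```latex
% MAE-style objective: predict the target in data space.
\mathcal{L}_{\mathrm{MAE}}
  = \tfrac{1}{2}\,\mathbb{E}\,\bigl\lVert V W x - y \bigr\rVert^{2}

% JEPA-style (self-distillation) objective: predict the target's
% latent representation, with no gradient through the target branch.
\mathcal{L}_{\mathrm{JEPA}}
  = \tfrac{1}{2}\,\mathbb{E}\,\bigl\lVert V W x - \mathrm{sg}(W y) \bigr\rVert^{2}
```

The only difference in this sketch is the space in which the prediction error is measured: data space for MAE, latent space for JEPA.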

artificial intelligence, machine learning, objective, (16 more...)

2407.03475

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)

Neural Information Processing Systems, Mar-3-2024, 06:01:36 GMT

Analytical Study of Momentum-Based Acceleration Methods in Paradigmatic High-Dimensional Non-Convex Problems

The optimization step in many machine learning problems rarely relies on vanilla gradient descent; instead, it is common practice to use momentum-based accelerated methods. Despite these algorithms being widely applied to arbitrary loss functions, their behaviour in generically non-convex, high-dimensional landscapes is poorly understood. In this work, we use dynamical mean field theory techniques to describe analytically the average dynamics of these methods in a prototypical non-convex model: the (spiked) matrix-tensor model. We derive a closed set of equations that describe the behaviour of heavy-ball momentum and Nesterov acceleration in the infinite-dimensional limit. By numerical integration of these equations, we observe that these methods speed up the dynamics but do not improve the algorithmic threshold with respect to gradient descent in the spiked model.
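For readers unfamiliar with the two methods named in the abstract, the following minimal sketch shows the standard heavy-ball and Nesterov update rules next to plain gradient descent. It runs on a toy ill-conditioned quadratic, not on the spiked matrix-tensor model, and it does not reproduce the paper's mean-field analysis; the function names, step size, and momentum value are illustrative assumptions.

```python
# Toy comparison of gradient descent, heavy-ball momentum, and Nesterov
# acceleration on f(x) = 0.5 * sum(a_i * x_i^2). Illustrative only.

def loss(x, curvatures):
    """Quadratic loss with per-coordinate curvature a_i."""
    return 0.5 * sum(a * xi * xi for a, xi in zip(curvatures, x))


def grad(x, curvatures):
    """Gradient of the quadratic loss: a_i * x_i per coordinate."""
    return [a * xi for a, xi in zip(curvatures, x)]


def run(method, x0, curvatures, lr=0.009, beta=0.9, steps=200):
    """Run `steps` iterations of the chosen method and return the final loss."""
    x = list(x0)
    v = [0.0] * len(x)  # velocity (unused history for plain GD)
    for _ in range(steps):
        if method == "gd":
            g = grad(x, curvatures)
            v = [-lr * gi for gi in g]
        elif method == "heavy_ball":
            # Gradient evaluated at the current point.
            g = grad(x, curvatures)
            v = [beta * vi - lr * gi for vi, gi in zip(v, g)]
        elif method == "nesterov":
            # Gradient evaluated at the look-ahead point x + beta * v.
            lookahead = [xi + beta * vi for xi, vi in zip(x, v)]
            g = grad(lookahead, curvatures)
            v = [beta * vi - lr * gi for vi, gi in zip(v, g)]
        else:
            raise ValueError(f"unknown method: {method}")
        x = [xi + vi for xi, vi in zip(x, v)]
    return loss(x, curvatures)


if __name__ == "__main__":
    curvatures = [1.0, 100.0]  # condition number 100 (assumed toy value)
    x0 = [1.0, 1.0]
    for name in ("gd", "heavy_ball", "nesterov"):
        print(name, run(name, x0, curvatures))
```

On this convex toy problem both momentum variants reach a lower loss than gradient descent in the same number of steps, which illustrates the "speed up" part of the claim only; the statement about the algorithmic threshold concerns the non-convex spiked model and is not tested here.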

algorithm, equation, gradient descent, (13 more...)

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(3 more...)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.88)

Zou, Wenxuan, Huang, Haiping

Introduction to dynamical mean-field theory of randomly connected neural networks with bidirectionally correlated couplings

arXiv.org Artificial Intelligence, Oct-7-2023

Dynamical mean-field theory is a powerful physics tool used to analyze the typical behavior of neural networks, where neurons can be recurrently connected, or multiple layers of neurons can be stacked. However, it is not easy for beginners to access the essence of this tool and the underlying physics. Here, we give a pedagogical introduction to this method through a particular example of random neural networks, where neurons are randomly and fully connected by correlated synapses and therefore the network exhibits rich emergent collective dynamics. We also review important past and recent works applying this tool. In addition, a physically transparent alternative method, namely the dynamical cavity method, is introduced to derive exactly the same results. The numerical implementation of solving the integro-differential mean-field equations is also detailed, with an illustration of exploring the fluctuation-dissipation theorem.

equation, neural network, response function, (12 more...)

2305.08459

Country:

Asia > Singapore (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Neural Information Processing Systems, Apr-6-2023, 12:17:16 GMT

Learning optimal spike-based representations

How do neural networks learn to represent information? Here, we address this question by assuming that neural networks seek to generate an optimal population representation for a fixed linear decoder. We define a loss function for the quality of the population read-out and derive the dynamical equations for both neurons and synapses from the requirement to minimize this loss. The dynamical equations yield a network of integrate-and-fire neurons undergoing Hebbian plasticity. We show that, through learning, initially regular and highly correlated spike trains evolve towards Poisson-distributed and independent spike trains with much lower firing rates.

dynamical equation, excitation and inhibition, learning optimal spike-based representation, (3 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.52)

Vargas, Danilo Vasconcellos, Foong, Tham Yik, Zhang, Heng

Dynamical Equations With Bottom-up Self-Organizing Properties Learn Accurate Dynamical Hierarchies Without Any Loss Function

arXiv.org Artificial Intelligence, Feb-4-2023

Self-organization is ubiquitous in nature and mind. However, machine learning and theories of cognition still barely touch the subject. The hurdle is that general patterns are difficult to define in terms of dynamical equations and designing a system that could learn by reordering itself is still to be seen. Here, we propose a learning system, where patterns are defined within the realm of nonlinear dynamics with positive and negative feedback loops, allowing attractor-repeller pairs to emerge for each pattern observed. Experiments reveal that such a system can map temporal to spatial correlation, enabling hierarchical structures to be learned from sequential data. The results are accurate enough to surpass state-of-the-art unsupervised learning algorithms in seven out of eight experiments as well as two real-world problems. Interestingly, the dynamic nature of the system makes it inherently adaptive, giving rise to phenomena similar to phase transitions in chemistry/thermodynamics when the input structure changes. Thus, the work here sheds light on how self-organization can allow for pattern recognition and hints at how intelligent behavior might emerge from simple dynamic equations without any objective/loss function.

artificial intelligence, hierarchical tsfmap, machine learning, (15 more...)

2302.0214

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)