AITopics | mnn

Collaborating Authors

mnn

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Expectation Backpropagation: Parameter-Free Training of Multilayer Neural Networks with Continuous or Discrete Weights

Neural Information Processing SystemsSep-30-2025, 10:26:36 GMT

Multilayer Neural Networks (MNNs) are commonly trained using gradient descent-based methods, such as BackPropagation (BP). Inference in probabilistic graphical models is often done using variational Bayes methods, such as Expectation Propagation (EP). We show how an EP based approach can also be used to train deterministic MNNs. Specifically, we approximate the posterior of the weights given the data using a "mean-field" factorized distribution, in an online setting. Using online EP and the central limit theorem we find an analytical approximation to the Bayes update of this posterior, as well as the resulting Bayes estimates of the weights and outputs. Despite a different origin, the resulting algorithm, Expectation BackPropagation (EBP), is very similar to BP in form and efficiency. However, it has several additional advantages: (1) Training is parameter-free, given initial conditions (prior) and the MNN architecture. This is useful for large-scale problems, where parameter tuning is a major challenge.

expectation backpropagation, multilayer neural network, parameter-free training, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Expectation Backpropagation: Parameter-Free Training of Multilayer Neural Networks with Continuous or Discrete Weights

Daniel Soudry, Itay Hubara, Ron Meir

Neural Information Processing SystemsFeb-8-2025, 15:55:06 GMT

artificial intelligence, machine learning, mnn, (18 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > New York (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.81)

Add feedback

Combining Entropy and Matrix Nuclear Norm for Enhanced Evaluation of Language Models

Vo, James

arXiv.org Artificial IntelligenceOct-18-2024

Large Language Models (LLMs) have revolutionized the field of natural language processing (NLP) by demonstrating unprecedented capabilities in understanding and generating human-like text. Models such as GPT-4, BERT, and their successors have not only achieved remarkable performance across a variety of tasks but have also extended their utility into multi-modal domains, encompassing vision, audio, and other data types. As these models continue to grow in size and complexity, evaluating their performance accurately and efficiently becomes increasingly critical. Traditional evaluation metrics for LLMs, including perplexity, accuracy, and F1 scores, primarily focus on task-specific outcomes. While these metrics provide valuable insights into a model's ability to perform particular tasks, they often fall short in capturing the underlying representational dynamics and information compression capabilities of the models. Moreover, as LLMs scale, the computational demands of these conventional metrics can become prohibitive, necessitating the development of more sophisticated and scalable evaluation methodologies. Recent advancements have introduced novel metrics that delve deeper into the internal workings of LLMs. One such approach is Diff-eRank Wei et al. [2024], a rank-based metric grounded in information theory and geometric principles.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2410.1448

Country: Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Expectation Backpropagation: Parameter-Free Training of Multilayer Neural Networks with Continuous or Discrete Weights

Daniel Soudry, Itay Hubara, Ron Meir

Neural Information Processing SystemsOct-6-2024, 10:57:20 GMT

Neural Information Processing Systems http://nips.cc/

algorithm, approximation, mnn, (15 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > New York (0.04)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

Add feedback

Utilizing Machine Learning and 3D Neuroimaging to Predict Hearing Loss: A Comparative Analysis of Dimensionality Reduction and Regression Techniques

Pittala, Trinath Sai Subhash Reddy, Meleti, Uma Maheswara R, Thatipamula, Manasa

arXiv.org Artificial IntelligenceMay-1-2024

In this project, we have explored machine learning approaches for predicting hearing loss thresholds on the brain's gray matter 3D images. We have solved the problem statement in two phases. In the first phase, we used a 3D CNN model to reduce high-dimensional input into latent space and decode it into an original image to represent the input in rich feature space. In the second phase, we utilized this model to reduce input into rich features and used these features to train standard machine learning models for predicting hearing thresholds. We have experimented with autoencoders and variational autoencoders in the first phase for dimensionality reduction and explored random forest, XGBoost and multi-layer perceptron for regressing the thresholds. We split the given data set into training and testing sets and achieved an 8.80 range and 22.57 range for PT500 and PT4000 on the test set, respectively. We got the lowest RMSE using multi-layer perceptron among the other models. Our approach leverages the unique capabilities of VAEs to capture complex, non-linear relationships within high-dimensional neuroimaging data. We rigorously evaluated the models using various metrics, focusing on the root mean squared error (RMSE). The results highlight the efficacy of the multi-layer neural network model, which outperformed other techniques in terms of accuracy. This project advances the application of data mining in medical diagnostics and enhances our understanding of age-related hearing loss through innovative machine-learning frameworks.

autoencoder, dimensionality reduction, hearing threshold, (12 more...)

arXiv.org Artificial Intelligence

2405.00142

Genre: Research Report > New Finding (0.47)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Dimensionality Reduction (0.61)

Add feedback

Training all-mechanical neural networks for task learning through in situ backpropagation

Li, Shuaifeng, Mao, Xiaoming

arXiv.org Artificial IntelligenceApr-23-2024

Recent advances unveiled physical neural networks as promising machine learning platforms, offering faster and more energy-efficient information processing. Compared with extensively-studied optical neural networks, the development of mechanical neural networks (MNNs) remains nascent and faces significant challenges, including heavy computational demands and learning with approximate gradients. Here, we introduce the mechanical analogue of in situ backpropagation to enable highly efficient training of MNNs. We demonstrate that the exact gradient can be obtained locally in MNNs, enabling learning through their immediate vicinity. With the gradient information, we showcase the successful training of MNNs for behavior learning and machine learning tasks, achieving high accuracy in regression and classification. Furthermore, we present the retrainability of MNNs involving task-switching and damage, demonstrating the resilience. Our findings, which integrate the theory for training MNNs and experimental and numerical validations, pave the way for mechanical machine learning hardware and autonomous self-learning material systems.

mnn, neural network, situ backpropagation, (14 more...)

arXiv.org Artificial Intelligence

2404.15471

Country:

North America > United States > California (0.14)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.88)

Industry: Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Backpropagation (0.64)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Mechanistic Neural Networks for Scientific Machine Learning

Pervez, Adeel, Locatello, Francesco, Gavves, Efstratios

arXiv.org Artificial IntelligenceFeb-20-2024

This paper presents Mechanistic Neural Networks, a neural network design for machine learning applications in the sciences. It incorporates a new Mechanistic Block in standard architectures to explicitly learn governing differential equations as representations, revealing the underlying dynamics of data and enhancing interpretability and efficiency in data modeling. Central to our approach is a novel Relaxed Linear Programming Solver (NeuRLP) inspired by a technique that reduces solving linear ODEs to solving linear programs. This integrates well with neural networks and surpasses the limitations of traditional ODE solvers enabling scalable GPU parallel processing. Overall, Mechanistic Neural Networks demonstrate their versatility for scientific machine learning applications, adeptly managing tasks from equation discovery to dynamic systems modeling. We prove their comprehensive capabilities in analyzing and interpreting complex scientific data across various applications, showing significant performance against specialized state-of-the-art methods.

equation, ode, solver, (14 more...)

arXiv.org Artificial Intelligence

2402.13077

Country:

Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Austria (0.04)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Meta-Learning for Neural Network-based Temporal Point Processes

Takimoto, Yoshiaki, Tanaka, Yusuke, Iwata, Tomoharu, Okawa, Maya, Kim, Hideaki, Toda, Hiroyuki, Kurashima, Takeshi

arXiv.org Artificial IntelligenceJan-28-2024

Human activities generate various event sequences such as taxi trip records, bike-sharing pick-ups, crime occurrence, and infectious disease transmission. The point process is widely used in many applications to predict such events related to human activities. However, point processes present two problems in predicting events related to human activities. First, recent high-performance point process models require the input of sufficient numbers of events collected over a long period (i.e., long sequences) for training, which are often unavailable in realistic situations. Second, the long-term predictions required in real-world applications are difficult. To tackle these problems, we propose a novel meta-learning approach for periodicity-aware prediction of future events given short sequences. The proposed method first embeds short sequences into hidden representations (i.e., task representations) via recurrent neural networks for creating predictions from short sequences. It then models the intensity of the point process by monotonic neural networks (MNNs), with the input being the task representations. We transfer the prior knowledge learned from related tasks and can improve event prediction given short sequences of target tasks. We design the MNNs to explicitly take temporal periodic patterns into account, contributing to improved long-term prediction performance. Experiments on multiple real-world datasets demonstrate that the proposed method has higher prediction performance than existing alternatives.

dataset, intensity function, sequence, (15 more...)

arXiv.org Artificial Intelligence

2401.15846

Country:

North America > United States > New York (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture > Yokohama (0.04)

Genre: Research Report (1.00)

Industry:

Transportation (0.68)
Health & Medicine (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Stability to Deformations of Manifold Filters and Manifold Neural Networks

Wang, Zhiyang, Ruiz, Luana, Ribeiro, Alejandro

arXiv.org Artificial IntelligenceSep-12-2023

The paper defines and studies manifold (M) convolutional filters and neural networks (NNs). \emph{Manifold} filters and MNNs are defined in terms of the Laplace-Beltrami operator exponential and are such that \emph{graph} (G) filters and neural networks (NNs) are recovered as discrete approximations when the manifold is sampled. These filters admit a spectral representation which is a generalization of both the spectral representation of graph filters and the frequency response of standard convolutional filters in continuous time. The main technical contribution of the paper is to analyze the stability of manifold filters and MNNs to smooth deformations of the manifold. This analysis generalizes known stability properties of graph filters and GNNs and it is also a generalization of known stability properties of standard convolutional filters and neural networks in continuous time. The most important observation that follows from this analysis is that manifold filters, same as graph filters and standard continuous time filters, have difficulty discriminating high frequency components in the presence of deformations. This is a challenge that can be ameliorated with the use of manifold, graph, or continuous time neural networks. The most important practical consequence of this analysis is to shed light on the behavior of graph filters and GNNs in large scale graphs.

operator, perturbation, stability, (15 more...)

arXiv.org Artificial Intelligence

2106.03725

Country: North America > United States > Pennsylvania (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Learning and Generalizing Polynomials in Simulation Metamodeling

Hauch, Jesper, Riis, Christoffer, Pereira, Francisco C.

arXiv.org Artificial IntelligenceJul-20-2023

The ability to learn polynomials and generalize out-of-distribution is essential for simulation metamodels in many disciplines of engineering, where the time step updates are described by polynomials. While feed forward neural networks can fit any function, they cannot generalize out-of-distribution for higher-order polynomials. Therefore, this paper collects and proposes multiplicative neural network (MNN) architectures that are used as recursive building blocks for approximating higher-order polynomials. Our experiments show that MNNs are better than baseline models at generalizing, and their performance in validation is true to their performance in out-of-distribution tests. In addition to MNN architectures, a simulation metamodeling approach is proposed for simulations with polynomial time step updates. For these simulations, simulating a time interval can be performed in fewer steps by increasing the step size, which entails approximating higher-order polynomials. While our approach is compatible with any simulation with polynomial time step updates, a demonstration is shown for an epidemiology simulation model, which also shows the inductive bias in MNNs for learning and generalizing higher-order polynomials.

artificial intelligence, machine learning, polynomial, (19 more...)

arXiv.org Artificial Intelligence

2307.10892

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Denmark > Capital Region > Kongens Lyngby (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Epidemiology (0.34)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback