AITopics

Neural Information Processing SystemsAug-16-2025, 00:47:44 GMT

Supplementary Material Appendices

Throughout the paper, our analysis is based on the NTK prameterization [10], under which the constancy of tangent kernel is originally observed. In this section, we show that different parame-terization strategies (e.g., LeCun initialization [LeCun et al.]

artificial intelligence, machine learning, probability, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Graph Automorphism Group Equivariant Neural Networks

arXiv.org Artificial IntelligenceJul-15-2023

For any graph $G$ having $n$ vertices and its automorphism group $\textrm{Aut}(G)$, we provide a full characterisation of all of the possible $\textrm{Aut}(G)$-equivariant neural networks whose layers are some tensor power of $\mathbb{R}^{n}$. In particular, we find a spanning set of matrices for the learnable, linear, $\textrm{Aut}(G)$-equivariant layer functions between such tensor power spaces in the standard basis of $\mathbb{R}^{n}$.

artificial intelligence, machine learning, vertex, (15 more...)

2307.0781

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom (0.04)

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

How Jellyfish Characterise Alternating Group Equivariant Neural Networks

arXiv.org Artificial IntelligenceJun-18-2023

We provide a full characterisation of all of the possible alternating group ($A_n$) equivariant neural networks whose layers are some tensor power of $\mathbb{R}^{n}$. In particular, we find a basis of matrices for the learnable, linear, $A_n$-equivariant layer functions between such tensor power spaces in the standard basis of $\mathbb{R}^{n}$. We also describe how our approach generalises to the construction of neural networks that are equivariant to local symmetries.

artificial intelligence, machine learning, partition, (18 more...)

2301.10152

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Brauer's Group Equivariant Neural Networks

arXiv.org Artificial IntelligenceJun-18-2023

We provide a full characterisation of all of the possible group equivariant neural networks whose layers are some tensor power of $\mathbb{R}^{n}$ for three symmetry groups that are missing from the machine learning literature: $O(n)$, the orthogonal group; $SO(n)$, the special orthogonal group; and $Sp(n)$, the symplectic group. In particular, we find a spanning set of matrices for the learnable, linear, equivariant layer functions between such tensor power spaces in the standard basis of $\mathbb{R}^{n}$ when the group is $O(n)$ or $SO(n)$, and in the symplectic basis of $\mathbb{R}^{n}$ when the group is $Sp(n)$.

artificial intelligence, diagram, machine learning, (15 more...)

2212.0863

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > Wales (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Education (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Connecting Permutation Equivariant Neural Networks and Partition Diagrams

arXiv.org Artificial IntelligenceJan-12-2023

We show how the Schur-Weyl duality that exists between the partition algebra and the symmetric group results in a stronger theoretical foundation for characterising all of the possible permutation equivariant neural networks whose layers are some tensor power of the permutation representation $M_n$ of the symmetric group $S_n$. In doing so, we unify two separate bodies of literature, and we correct some of the major results that are now widely quoted by the machine learning community. In particular, we find a basis of matrices for the learnable, linear, permutation equivariant layer functions between such tensor power spaces in the standard basis of $M_n$ by using an elegant graphical representation of a basis of set partitions for the partition algebra and its related vector spaces. Also, we show how we can calculate the number of weights that must appear in these layer functions by looking at certain paths through the McKay quiver for $M_n$. Finally, we describe how our approach generalises to the construction of neural networks that are equivariant to local symmetries.

artificial intelligence, machine learning, representation, (11 more...)

2212.08648

Country:

Asia > South Korea > Seoul > Seoul (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.04)

Genre: Research Report (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Hajij, Mustafa, Istvan, Kyle

Topological Deep Learning: Classification Neural Networks

arXiv.org Machine LearningFeb-16-2021

Topological deep learning is a formalism that is aimed at introducing topological language to deep learning for the purpose of utilizing the minimal mathematical structures to formalize problems that arise in a generic deep learning problem. This is the first of a sequence of articles with the purpose of introducing and studying this formalism. In this article, we define and study the classification problem in machine learning in a topological setting. Using this topological framework, we show when the classification problem is possible or not possible in the context of neural networks. Finally, we show that for a given data, the architecture of a classification neural network must take into account the topology of this data in order to achieve a successful classification task.

arxiv preprint arxiv, neural network, topologically, (9 more...)

arXiv.org Machine Learning

2102.08354

Country: North America > United States (0.14)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Therapeutic Area (0.47)
Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Zhang, Zhe, Lan, Guanghui

Optimal Algorithms for Convex Nested Stochastic Composite Optimization

arXiv.org Artificial IntelligenceNov-28-2020

Recently, convex nested stochastic composite optimization (NSCO) has received considerable attention for its application in reinforcement learning and risk-averse optimization. However, In the current literature, there exists a significant gap in the iteration complexities between these NSCO problems and other simpler stochastic composite optimization problems (e.g., sum of smooth and nonsmooth functions) without the nested structure. In this paper, we close the gap by reformulating a class of convex NSCO problems as "$\min\max\ldots \max$" saddle point problems under mild assumptions and proposing two primal-dual type algorithms with the optimal $\mathcal{O}\{1/\epsilon^2\}$ (resp., $\mathcal{O}\{1/\epsilon\}$) complexity for solving nested (resp., strongly) convex problems. More specifically, for the often-considered two-layer smooth-nonsmooth problem, we introduce a simple vanilla stochastic sequential dual (SSD) algorithm which can be implemented purely in the primal form. For the multi-layer problem, we propose a general stochastic sequential dual framework. The framework consists of modular dual updates for different types of functions (smooth, smoothable, and non-smooth, etc.), so that it can handle a more general composition of layer functions. Moreover, we present modular convergence proofs to show that the complexity of the general SSD is optimal with respect to nearly all the problem parameters.

complexity, layer function, proximal update, (16 more...)

2011.10076

Country: North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Hajij, Mustafa, Istvan, Kyle

A Topological Framework for Deep Learning

arXiv.org Machine LearningOct-21-2020

We utilize classical facts from topology to show that the classification problem in machine learning is always solvable under very mild conditions. Furthermore, we show that a softmax classification network acts on an input topological space by a finite sequence of topological moves to achieve the classification task. Moreover, given a training dataset, we show how topological formalism can be used to suggest the appropriate architectural choices for neural networks designed to be trained as classifiers on the data. Finally, we show how the architecture of a neural network cannot be chosen independently from the shape of the underlying data. To demonstrate these results, we provide example datasets and show how they are acted upon by neural nets from this topological perspective.

artificial intelligence, machine learning, neural network, (19 more...)

arXiv.org Machine Learning

2008.13697

Country: Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.85)

@machinelearnbotMay-19-2018, 15:40:25 GMT

Complete Guide to Build ConvNet HTTP-Based Application using TensorFlow and Flask RESTful Python API

This tutorial takes you along the steps required to create a convolutional neural network (CNN/ConvNet) using TensorFlow and get it into production by allowing remote access via a HTTP-based application using Flask RESTful API. In this tutorial, a CNN is to be built using TensorFlow NN (tf.nn) module. The CNN model architecture is created and trained and tested against the CIFAR10 dataset. To make the model remotely accessible, a Flask Web application is created using Python to receive an uploaded image and return its classification label using HTTP. Anaconda3 is used in addition to TensorFlow on Windows with CPU support.

artificial intelligence, machine learning, python, (15 more...)

@machinelearnbot

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)