 abelian group


Breaking Data Symmetry is Needed For Generalization in Feature Learning Kernels

Bernal, Marcel Tomàs, Mallinar, Neil Rohit, Belkin, Mikhail

arXiv.org Machine Learning

Grokking occurs when a model attains high training accuracy but generalizes to unseen test points only long afterward. This phenomenon was initially observed on a class of algebraic problems, such as learning modular arithmetic (Power et al., 2022). We study grokking on algebraic tasks in a class of feature learning kernels via the Recursive Feature Machine (RFM) algorithm (Radhakrishnan et al., 2024), which iteratively updates feature matrices through the Average Gradient Outer Product (AGOP) of an estimator in order to learn task-relevant features. Our main experimental finding is that generalization occurs only when a certain symmetry in the training set is broken. Furthermore, we empirically show that RFM generalizes by recovering the underlying invariance group action inherent in the data. We find that the learned feature matrices encode specific elements of the invariance group, explaining the dependence of generalization on symmetry.
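
Since the abstract centers on the RFM/AGOP iteration, a minimal NumPy sketch of that loop may help. It assumes a Gaussian Mahalanobis kernel, kernel ridge regression, and illustrative hyperparameters (sigma, reg, iters); it is a sketch of the general idea, not the authors' implementation.

import numpy as np

def mahalanobis_kernel(X, Z, M, sigma=1.0):
    # k_M(x, z) = exp(-(x - z)^T M (x - z) / (2 sigma^2))
    d = X[:, None, :] - Z[None, :, :]                 # pairwise differences (n, m, dim)
    sq = np.einsum('nmi,ij,nmj->nm', d, M, d)         # squared M-distances
    return np.exp(-sq / (2 * sigma ** 2))

def rfm(X, y, iters=5, sigma=1.0, reg=1e-3):
    n, dim = X.shape
    M = np.eye(dim)                                   # start with identity features
    for _ in range(iters):
        K = mahalanobis_kernel(X, X, M, sigma)
        alpha = np.linalg.solve(K + reg * np.eye(n), y)      # kernel ridge fit
        # gradient of f(x) = sum_j alpha_j k_M(x, x_j), evaluated at each training point
        d = X[:, None, :] - X[None, :, :]
        G = -(K * alpha[None, :])[:, :, None] * (d @ M) / sigma ** 2
        grads = G.sum(axis=1)                         # (n, dim)
        M = grads.T @ grads / n                       # AGOP update of the feature matrix
        M /= np.trace(M) + 1e-12                      # keep the scale bounded
    return M                                          # refit the predictor with the final M before use

# Tiny usage example on a synthetic task where only two coordinates matter:
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
y = (X[:, 0] * X[:, 1] > 0).astype(float)
M = rfm(X, y)
print(np.round(np.diag(M), 3))   # diagonal mass is expected to concentrate on features 0 and 1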




FP=xINT: A Low-Bit Series Expansion Algorithm for Post-Training Quantization

Zhang, Boyang, Cheng, Daning, Zhang, Yunquan, Liu, Fangmin

arXiv.org Artificial Intelligence

Post-Training Quantization (PTQ) converts pre-trained Full-Precision (FP) models into quantized versions without training. While existing methods reduce size and computational costs, they also significantly degrade performance and quantization efficiency at extremely low bit-widths due to quantization noise. We introduce a deep model series expansion framework to address this issue, enabling rapid and accurate approximation of unquantized models without calibration sets or fine-tuning. This is the first use of series expansion for neural network quantization. Specifically, our method expands the FP model into multiple low-bit basis models. To ensure accurate quantization, we develop low-bit basis model expansions at different granularities (tensor, layer, model), and theoretically confirm their convergence to the dense model, thus restoring FP model accuracy. Additionally, we design AbelianAdd/Mul operations between isomorphic models in the low-bit expansion, forming an Abelian group to ensure operation parallelism and commutativity. Experiments show that our algorithm achieves state-of-the-art performance in low-bit settings; for example, 4-bit quantization of ResNet-50 surpasses the original accuracy, reaching 77.03%. The code will be made public.
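
The series-expansion idea can be pictured with a small residual-quantization sketch: the FP tensor is approximated by a sum of low-bit terms, and because elementwise addition is commutative and associative, the terms can be combined in any order or in parallel (the abelian-group property the abstract exploits). This is an illustrative reading rather than the paper's FP=xINT algorithm; the bit-width, number of terms, and symmetric uniform quantizer are assumptions.

import numpy as np

def quantize(x, bits=4):
    # symmetric uniform quantizer to `bits` bits (illustrative choice)
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax + 1e-12
    return np.round(x / scale).clip(-qmax, qmax) * scale

def series_expand(w, bits=4, terms=3):
    # w ≈ q_1 + q_2 + ... + q_terms, each q_i a low-bit tensor quantizing the remaining residual
    basis, residual = [], w.copy()
    for _ in range(terms):
        q = quantize(residual, bits)
        basis.append(q)
        residual = residual - q
    return basis

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)
basis = series_expand(w, bits=4, terms=3)
approx = sum(basis)                        # elementwise sums commute, so order does not matter
print(np.abs(w - approx).max())            # approximation error shrinks as terms are added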


Acceleration of Grokking in Learning Arithmetic Operations via Kolmogorov-Arnold Representation

Park, Yeachan, Kim, Minseok, Kim, Yeoneung

arXiv.org Artificial Intelligence

We propose novel methodologies aimed at accelerating the grokking phenomenon, which refers to the rapid increase in test accuracy after a long period of overfitting, as reported by Power et al. (2022). Focusing on the grokking phenomenon that arises in learning arithmetic binary operations via the transformer model, we begin with a discussion of data augmentation in the case of commutative binary operations. To accelerate grokking further, we elucidate arithmetic operations through the lens of the Kolmogorov-Arnold (KA) representation theorem, revealing its correspondence to the transformer architecture: embedding, decoder block, and classifier. Observing the shared structure between KA representations associated with binary operations, we suggest various transfer learning mechanisms that expedite grokking. This interpretation is substantiated through a series of rigorous experiments. In addition, our approach succeeds in learning two nonstandard arithmetic tasks: composition of operations and a system of equations. Furthermore, we show that the model is capable of learning arithmetic operations using a limited number of tokens under embedding transfer, which is likewise supported by a set of experiments.
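
The commutativity-based data augmentation mentioned above has a straightforward form: every training triple (a, b, a∘b) also yields (b, a, a∘b). A minimal sketch, using modular addition as an assumed illustrative task rather than the paper's exact setup:

def make_dataset(p=97):
    # all triples (a, b, (a + b) mod p) for a commutative binary operation
    return [(a, b, (a + b) % p) for a in range(p) for b in range(p)]

def augment_commutative(samples):
    # add the flipped pair whenever it is not already present
    seen = set(samples)
    flipped = [(b, a, c) for (a, b, c) in samples if (b, a, c) not in seen]
    return samples + flipped

train = make_dataset()[:2000]            # pretend this is the observed training split
train_aug = augment_commutative(train)
print(len(train), len(train_aug))        # the augmented set is up to twice as large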


MathGloss: Building mathematical glossaries from text

Horowitz, Lucy, de Paiva, Valeria

arXiv.org Artificial Intelligence

MathGloss is a project to create a knowledge graph (KG) for undergraduate mathematics from text, automatically, using modern natural language processing (NLP) tools and resources already available on the web. MathGloss is a linked database of undergraduate concepts in mathematics. So far, it combines five resources: (i) Wikidata, a collaboratively edited, multilingual knowledge graph hosted by the Wikimedia Foundation, (ii) terms covered in mathematics courses at the University of Chicago, (iii) the syllabus of the French undergraduate mathematics curriculum, which includes hyperlinks to the theorem prover Lean 4, (iv) MuLiMa, a multilingual dictionary of mathematics curated by mathematicians, and (v) the nLab, a wiki for category theory also curated by mathematicians. MathGloss's goal is to bring together resources for learning mathematics and to allow every mathematician to tailor their learning to their own preferences. Moreover, by organizing different resources for learning undergraduate mathematics alongside those for learning formal mathematics, we hope to make it easier for mathematicians and experts in formal tools (theorem provers, computer algebra systems, etc.) to "understand" each other and break down some of the barriers to formal math.


Learning representations by forward-propagating errors

Jang, Ryoungwoo

arXiv.org Artificial Intelligence

In 1986, Rumelhart, Hinton, and Williams proposed a learning algorithm for neural networks that is now usually called back-propagation (BP) Rumelhart et al. [1986]. Since then, deep neural networks have become trainable in practice and flourished following AlexNet Krizhevsky et al. [2017]. Countless studies have been proposed to train more accurate models, analyze model behavior, and extend these methods to numerous fields. However, one profound question remains: Is the learning rule for neural networks unique? Geoffrey Hinton appears to have contemplated this problem for a long time. In Lillicrap et al. [2020], Hinton and his colleagues examined issues with back-propagation from various perspectives. In 2022, Hinton suggested a new learning rule named the forward-forward algorithm Hinton [2022].


Quantum algorithms for group convolution, cross-correlation, and equivariant transformations

Castelazo, Grecia, Nguyen, Quynh T., De Palma, Giacomo, Englund, Dirk, Lloyd, Seth, Kiani, Bobak T.

arXiv.org Artificial Intelligence

Group convolutions and cross-correlations, which are equivariant to the actions of group elements, are commonly used in mathematics to analyze or take advantage of symmetries inherent in a given problem setting. Here, we provide efficient quantum algorithms for performing linear group convolutions and cross-correlations on data stored as quantum states. Runtimes for our algorithms are logarithmic in the dimension of the group, thus offering an exponential speedup compared to classical algorithms when the input data is provided as a quantum state and the linear operations are well conditioned. Motivated by the rich literature on quantum algorithms for solving algebraic problems, our theoretical framework opens a path for quantizing many algorithms in machine learning and numerical methods that employ group operations.
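
The classical operation being accelerated has a simple definition, (f * g)(u) = sum_v f(v) g(v^{-1} u) over a finite group. Below is a minimal classical NumPy sketch on the cyclic group Z_n, where group convolution reduces to circular convolution, with an FFT cross-check; it illustrates the definition only and does not reproduce the paper's quantum algorithms.

import numpy as np

def group_convolve(f, g, mul, inv, elements):
    # f, g: dict element -> value; mul/inv define the group operation
    return {u: sum(f[v] * g[mul(inv(v), u)] for v in elements) for u in elements}

n = 8
elements = list(range(n))
mul = lambda a, b: (a + b) % n           # group operation of Z_n
inv = lambda a: (-a) % n                 # group inverse in Z_n

rng = np.random.default_rng(1)
f = dict(zip(elements, rng.standard_normal(n)))
g = dict(zip(elements, rng.standard_normal(n)))

conv = group_convolve(f, g, mul, inv, elements)
# sanity check: on Z_n this equals circular convolution, computable via the FFT
fft_conv = np.fft.ifft(np.fft.fft(list(f.values())) * np.fft.fft(list(g.values()))).real
print(np.allclose([conv[u] for u in elements], fft_conv))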


Implicit Bias of Linear Equivariant Networks

Lawrence, Hannah, Georgiev, Kristian, Dienes, Andrew, Kiani, Bobak T.

arXiv.org Artificial Intelligence

Group equivariant convolutional neural networks (G-CNNs) are generalizations of convolutional neural networks (CNNs) which excel in a wide range of scientific and technical applications by explicitly encoding group symmetries, such as rotations and permutations, in their architectures. Although the success of G-CNNs is driven by the explicit symmetry bias of their convolutional architecture, a recent line of work has proposed that the implicit bias of training algorithms on a particular parameterization (or architecture) is key to understanding generalization for overparameterized neural nets. In this context, we show that $L$-layer full-width linear G-CNNs trained via gradient descent in a binary classification task converge to solutions with low-rank Fourier matrix coefficients, regularized by the $2/L$-Schatten matrix norm. Our work strictly generalizes previous analysis on the implicit bias of linear CNNs to linear G-CNNs over all finite groups, including the challenging setting of non-commutative symmetry groups (such as permutations). We validate our theorems via experiments on a variety of groups and empirically explore more realistic nonlinear networks, which locally capture similar regularization patterns. Finally, we provide intuitive interpretations of our Fourier space implicit regularization results in real space via uncertainty principles.
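
For reference, the regularizer named above can be stated in standard notation (a sketch under the usual conventions; the paper's exact normalization may differ). The Schatten-$p$ (quasi-)norm and the group Fourier transform of a function $f$ on a finite group $G$ are
\[
\|A\|_{S_p} \;=\; \Big(\sum_i \sigma_i(A)^{p}\Big)^{1/p},
\qquad
\hat{f}(\rho) \;=\; \sum_{g \in G} f(g)\,\rho(g),
\]
where $\sigma_i(A)$ are the singular values of $A$ and $\rho$ ranges over the irreducible representations of $G$. Penalizing $\|\hat{f}(\rho)\|_{S_{2/L}}$ with $2/L < 1$ drives many singular values of each Fourier coefficient toward zero, which is the sense in which the solutions described above have low-rank Fourier matrix coefficients.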


Strictly proper kernel scores and characteristic kernels on compact spaces

Steinwart, Ingo, Ziegel, Johanna F.

arXiv.org Machine Learning

Strictly proper kernel scores are a well-known tool in probabilistic forecasting, while characteristic kernels have been extensively investigated in the machine learning literature. We first show that both notions coincide, so that insights from one part of the literature can be used in the other. We then show that the metric induced by a characteristic kernel cannot reliably distinguish between distributions that are far apart in the total variation norm as soon as the underlying space of measures is infinite-dimensional. In addition, we provide a characterization of characteristic kernels in terms of eigenvalues and eigenfunctions and apply this characterization to the case of continuous kernels on (locally) compact spaces. In the compact case we further show that characteristic kernels exist if and only if the space is metrizable. As special cases of our general theory we investigate translation-invariant kernels on compact Abelian groups and isotropic kernels on spheres. The latter are of particular interest for forecast evaluation of probabilistic predictions on spherical domains as frequently encountered in meteorology and climatology.
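
In one standard convention (sign and scaling conventions vary across the literature), the two notions said to coincide can be written for a bounded positive definite kernel $k$ as
\[
S_k(P, y) \;=\; \mathbb{E}_{X, X' \sim P}\, k(X, X') \;-\; 2\, \mathbb{E}_{X \sim P}\, k(X, y),
\qquad
\mathrm{MMD}_k(P, Q)^2 \;=\; \mathbb{E}\, k(X, X') - 2\, \mathbb{E}\, k(X, Y) + \mathbb{E}\, k(Y, Y'),
\]
with $X, X' \sim P$ and $Y, Y' \sim Q$ independent. Then $\mathbb{E}_{Y \sim Q}\big[S_k(P, Y) - S_k(Q, Y)\big] = \mathrm{MMD}_k(P, Q)^2$, so the kernel score is strictly proper exactly when the MMD vanishes only for $P = Q$, i.e. when $k$ is characteristic.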