AITopics

Country:

North America > Canada (0.28)
Asia > Middle East > UAE (0.14)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.68)

Industry:

Information Technology (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.94)

Neural Information Processing Systems, Mar-27-2025, 09:47:10 GMT

A Supplementary material

A.1 Derivation of Theorem 3.1

Let G be a connected Lie group of transformations acting on the n-dimensional manifold X. In our special case, the infinitesimal generator h is a matrix, which generates the one-parameter group of transformations exp(h t).

A.2 LieGG sample complexity

LieGG computation requires finding the nullspace of the network polarization matrix. The dimensionality of the network polarization matrix depends on the number of samples in a dataset. Since the dimensionality for a full-scale dataset may be prohibitively large, we conduct a sample complexity study to investigate whether using all samples in a dataset is necessary to effectively retrieve the symmetries learned by a neural network.
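As a small illustration of the one-parameter group exp(h t) generated by a matrix h, the sketch below approximates the matrix exponential with a truncated Taylor series. The helper names (`mat_mul`, `mat_exp`) and the choice of the planar rotation generator are assumptions made for this example and are not taken from the paper.

```python
import math

def mat_mul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(h, t, terms=30):
    """Approximate exp(h * t) with a truncated Taylor series."""
    n = len(h)
    ht = [[h[i][j] * t for j in range(n)] for i in range(n)]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]  # holds (h t)^k / k!, starting at k = 0
    for k in range(1, terms):
        term = mat_mul(term, ht)
        term = [[v / k for v in row] for row in term]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result

# Generator of planar rotations (illustrative choice, not from the paper).
h_rot = [[0.0, -1.0], [1.0, 0.0]]

# exp(h * pi/2) should be, up to rounding, a rotation by 90 degrees.
g = mat_exp(h_rot, math.pi / 2)
```

For this generator, varying t sweeps out the rotation group SO(2), which is the sense in which a single matrix h describes a whole family of transformations.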

artificial intelligence, machine learning, symmetry, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.32)

Neural Information Processing Systems, Mar-27-2025, 09:47:07 GMT

LieGG: Studying Learned Lie Group Generators

Symmetries built into a neural network have proven very beneficial for a wide range of tasks, as they save the data otherwise needed to learn them. We depart from the position that when symmetries are not built into a model a priori, it is advantageous for robust networks to learn symmetries directly from the data to fit a task function. In this paper, we present a method to extract symmetries learned by a neural network and to evaluate the degree to which a network is invariant to them. With our method, we are able to explicitly retrieve learned invariances in the form of the generators of corresponding Lie groups without prior knowledge of symmetries in the data. We use the proposed method to study how symmetrical properties depend on a neural network's parameterization and configuration. We found that the ability of a network to learn symmetries generalizes over a range of architectures. However, the quality of learned symmetries depends on the depth and the number of parameters.
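One way to make "the degree to which a network is invariant" concrete is the infinitesimal condition that, for an invariant function f and a generator h, the directional derivative grad f(x) . (h x) vanishes at every sample x. The sketch below measures the mean absolute value of that quantity. It is a toy stand-in under stated assumptions: `f` replaces a trained network, gradients come from finite differences, and the name `invariance_residual` is hypothetical rather than the metric defined in the paper.

```python
import random

def f(x):
    """Toy rotation-invariant function standing in for a trained network."""
    return x[0] ** 2 + x[1] ** 2

def grad(func, x, eps=1e-6):
    """Central finite-difference gradient of func at x."""
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += eps
        xm[i] -= eps
        g.append((func(xp) - func(xm)) / (2 * eps))
    return g

def invariance_residual(func, h, samples):
    """Mean |grad f(x) . (h x)| over samples; near zero means invariance along h."""
    total = 0.0
    for x in samples:
        n = len(x)
        hx = [sum(h[i][j] * x[j] for j in range(n)) for i in range(n)]
        g = grad(func, x)
        total += abs(sum(gi * vi for gi, vi in zip(g, hx)))
    return total / len(samples)

random.seed(0)
samples = [[random.uniform(-1, 1), random.uniform(-1, 1)] for _ in range(100)]

rotation = [[0.0, -1.0], [1.0, 0.0]]  # generator of planar rotations
scaling = [[1.0, 0.0], [0.0, 1.0]]    # generator of uniform scaling

res_rot = invariance_residual(f, rotation, samples)   # close to zero
res_scale = invariance_residual(f, scaling, samples)  # clearly nonzero
```

Because the condition is linear in the entries of h, stacking it over many samples gives a linear system whose nullspace contains the candidate generators, which is consistent with the nullspace computation the supplementary material mentions.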

artificial intelligence, machine learning, symmetry, (17 more...)

Country: Europe > Netherlands (0.28)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.47)

Neural Information Processing Systems, Mar-27-2025, 08:48:29 GMT

Amortized Fourier Neural Operators

Fourier Neural Operators (FNOs) have shown promise for solving partial differential equations (PDEs). Typically, FNOs employ separate parameters for different frequency modes to specify tunable kernel integrals in Fourier space, which, however, results in an undesirably large number of parameters when solving high-dimensional PDEs. A workaround is to abandon the frequency modes exceeding a predefined threshold, but this limits the FNOs' ability to represent high-frequency details and poses non-trivial challenges for hyper-parameter specification. To address these issues, we propose AMortized Fourier Neural Operator (AM-FNO), where an amortized neural parameterization of the kernel function is deployed to accommodate arbitrarily many frequency modes using a fixed number of parameters. We introduce two implementations of AM-FNO, based respectively on the recently developed Kolmogorov-Arnold Network (KAN) and on Multi-Layer Perceptrons (MLPs) equipped with orthogonal embedding functions. We extensively evaluate our method on diverse datasets from various domains and observe up to 31% average improvement compared to competing neural operator baselines.
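The idea of amortizing the kernel over frequencies can be sketched in one dimension: instead of storing one weight per frequency mode, a small network maps a frequency to a kernel value, so the parameter count stays fixed as the number of modes grows. Everything below is an illustrative assumption rather than the AM-FNO implementation: the class name `AmortizedKernel`, the cosine embedding, the real-valued kernel, and the random (untrained) weights.

```python
import cmath
import math
import random

def dft(signal):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)) for k in range(n)]

def idft(coeffs):
    """Naive inverse discrete Fourier transform."""
    n = len(coeffs)
    return [sum(coeffs[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n for t in range(n)]

class AmortizedKernel:
    """Tiny MLP mapping a normalized frequency in [0, 1] to a real kernel value."""

    def __init__(self, n_embed=4, hidden=8, seed=0):
        rng = random.Random(seed)
        self.n_embed = n_embed
        self.w1 = [[rng.gauss(0, 1) for _ in range(n_embed)]
                   for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [rng.gauss(0, 1) for _ in range(hidden)]

    def num_params(self):
        return len(self.w1) * self.n_embed + len(self.b1) + len(self.w2)

    def __call__(self, freq):
        # Cosine embedding of the frequency (an assumption for this sketch).
        emb = [math.cos(math.pi * m * freq) for m in range(self.n_embed)]
        hid = [math.tanh(sum(w * e for w, e in zip(row, emb)) + b)
               for row, b in zip(self.w1, self.b1)]
        return sum(w * h for w, h in zip(self.w2, hid))

def spectral_layer(signal, kernel):
    """Multiply every Fourier mode by the amortized kernel and transform back."""
    n = len(signal)
    coeffs = dft(signal)
    # min(k, n - k) treats mode k and its mirror alike, keeping the output real.
    scaled = [coeffs[k] * kernel(min(k, n - k) / (n / 2)) for k in range(n)]
    return [v.real for v in idft(scaled)]

kernel = AmortizedKernel()
out_small = spectral_layer([math.sin(2 * math.pi * t / 16) for t in range(16)], kernel)
out_large = spectral_layer([math.sin(2 * math.pi * t / 64) for t in range(64)], kernel)
```

The same `kernel` object handles both the 16-point and the 64-point input without any change in its parameter count, which is the property the abstract attributes to the amortized parameterization.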

artificial intelligence, benchmark, machine learning, (19 more...)

Country: Asia > China (0.14)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.68)

Industry: Energy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.68)

arXiv.org Artificial Intelligence, Mar-27-2025

Tune It Up: Music Genre Transfer and Prediction

Samet, Fidan, Bakir, Oguz, Fidan, Adnan

Deep generative models have been used in style transfer tasks for images. In this study, we adapt and improve the CycleGAN model to perform music style transfer on the Jazz and Classic genres. By doing so, we aim to easily generate new songs, cover music in different genres, and reduce the arrangements needed in those processes. We train and use a music genre classifier to assess the performance of the transfer models. To that end, we obtain 87.7% accuracy with a Multi-layer Perceptron algorithm. To improve our style transfer baseline, we add auxiliary discriminators and triplet loss to our model. According to our experiments, we obtain best accuracies of 69.4% on the Jazz to Classic task and 39.3% on the Classic to Jazz task with our developed genre classifier. We also run a subjective experiment, and its results show that the overall performance of our transfer model is good and that it manages to conserve the melody of inputs in the transferred outputs. Our code is available at https://github.com/fidansamet/tune-it-up
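For readers unfamiliar with the triplet loss mentioned above, a minimal sketch of its standard form follows: it pushes an anchor closer to a positive example than to a negative one by at least a margin. How the authors choose anchors, positives, negatives, and the margin is not stated in the abstract, so the vectors and the default `margin=1.0` here are placeholders.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin)."""
    return max(0.0,
               euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Positive is near the anchor and negative is far away: no penalty.
loss_easy = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])

# Roles swapped: the negative is closer than the positive, so the loss is large.
loss_hard = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```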

accuracy, artificial intelligence, machine learning, (16 more...)

2503.22008

Genre: Research Report > New Finding (0.48)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Neural Information Processing Systems, Mar-26-2025, 23:07:34 GMT

Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds

Bo Yang, Jianan Wang, Ronald Clark, Qingyong Hu, Sen Wang, Andrew Markham, Niki Trigoni

We propose a novel, conceptually simple and general framework for instance segmentation on 3D point clouds. Our method, called 3D-BoNet, follows the simple design philosophy of per-point multilayer perceptrons (MLPs).

artificial intelligence, machine learning, segmentation, (14 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Aslan, Haci Ismail, Wiesner, Philipp, Xiong, Ping, Kao, Odej

$\beta$-GNN: A Robust Ensemble Approach Against Graph Structure Perturbation

arXiv.org Artificial Intelligence, Mar-26-2025

Graph Neural Networks (GNNs) are playing an increasingly important role in the efficient operation and security of computing systems, with applications in workload scheduling, anomaly detection, and resource management. However, their vulnerability to network perturbations poses a significant challenge. We propose $\beta$-GNN, a model enhancing GNN robustness without sacrificing clean data performance. $\beta$-GNN uses a weighted ensemble, combining any GNN with a multi-layer perceptron. A learned dynamic weight, $\beta$, modulates the GNN's contribution. This $\beta$ not only weights GNN influence but also indicates data perturbation levels, enabling proactive mitigation. Experimental results on diverse datasets show $\beta$-GNN's superior adversarial accuracy and attack severity quantification. Crucially, $\beta$-GNN avoids perturbation assumptions, preserving clean data structure and performance.
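A weighted ensemble of this kind can be sketched as a convex combination of the two models' outputs, with a learned scalar squashed into (0, 1). The abstract does not give the exact formula, so the combination rule, the sigmoid, and the names `beta_ensemble` and `beta_raw` below are assumptions for illustration only.

```python
import math

def sigmoid(z):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def beta_ensemble(gnn_logits, mlp_logits, beta_raw):
    """Blend GNN and MLP outputs as beta * GNN + (1 - beta) * MLP.

    beta_raw is an unconstrained scalar that would be learned in training;
    the convex-combination form is an assumption of this sketch.
    """
    beta = sigmoid(beta_raw)
    mixed = [beta * g + (1.0 - beta) * m
             for g, m in zip(gnn_logits, mlp_logits)]
    return mixed, beta

# With beta_raw = 0 both models contribute equally.
mixed, beta = beta_ensemble([2.0, 0.0], [0.0, 2.0], beta_raw=0.0)

# A strongly negative beta_raw leans almost entirely on the MLP, which
# ignores graph structure and is therefore unaffected by edge perturbations.
mixed_mlp, beta_low = beta_ensemble([2.0, 0.0], [0.0, 2.0], beta_raw=-20.0)
```

Under this reading, a small learned beta would mean the model found the graph structure less trustworthy, which matches the abstract's claim that beta also signals perturbation levels.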

artificial intelligence, data mining, machine learning, (16 more...)

doi: 10.1145/3721146.3721949

2503.2063

Country:

North America > United States (1.00)
Europe (1.00)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (0.49)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Sousa, Rita T., Paulheim, Heiko

Multi-dataset and Transfer Learning Using Gene Expression Knowledge Graphs

arXiv.org Artificial Intelligence, Mar-26-2025

Gene expression datasets offer insights into gene regulation mechanisms, biochemical pathways, and cellular functions. Additionally, comparing gene expression profiles between disease and control patients can deepen the understanding of disease pathology. Therefore, machine learning has been used to process gene expression data, with patient diagnosis emerging as one of the most popular applications. Although gene expression data can provide valuable insights, challenges arise because the number of patients in expression datasets is usually limited, and the data from different datasets with different gene expressions cannot be easily combined. This work proposes a novel methodology to address these challenges by integrating multiple gene expression datasets and domain-specific knowledge using knowledge graphs, a unique tool for biomedical data integration. Then, vector representations are produced using knowledge graph embedding techniques, which are used as inputs for a graph neural network and a multi-layer perceptron. We evaluate the efficacy of our methodology in three settings: single-dataset learning, multi-dataset learning, and transfer learning. The experimental results show that combining gene expression datasets and domain-specific knowledge improves patient diagnosis in all three settings.

artificial intelligence, dataset, machine learning, (14 more...)

2503.204

Genre: Research Report > New Finding (0.88)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Garcia, Roberto, Liu, Jerry, Sorvisto, Daniel, Eyuboglu, Sabri

Adaptive Rank Allocation: Speeding Up Modern Transformers with RaNA Adapters

arXiv.org Artificial Intelligence, Mar-23-2025

Large Language Models (LLMs) are computationally intensive, particularly during inference. Neuron-adaptive techniques, which selectively activate neurons in Multi-Layer Perceptron (MLP) layers, offer some speedups but suffer from limitations in modern Transformers. These include reliance on sparse activations, incompatibility with attention layers, and the use of costly neuron masking techniques. To address these issues, we propose the Adaptive Rank Allocation framework and introduce the Rank and Neuron Allocator (RaNA) adapter. RaNA adapters leverage rank adapters, which operate on linear layers by applying both low-rank matrix decompositions and adaptive masking to efficiently allocate compute without depending on activation sparsity. This enables RaNA to be generally applied to MLPs and linear components of attention modules, while eliminating the need for expensive maskers found in neuron-adaptive methods. Notably, when compared to neuron adapters, RaNA improves perplexity by up to 7 points and increases accuracy by up to 8 percentage-points when reducing FLOPs by 44% in state-of-the-art Transformer architectures.

As Large Language Models (LLMs) have grown in popularity and size, they have begun consuming a non-trivial amount of compute and time for training and inference (Kim et al. (2023), Pope et al. (2022)). Adaptive compute methods seek to speed up the inference stage of Transformers (Vaswani et al. (2023)), the de facto LLM architecture, by identifying and avoiding redundant computations to save I/O and floating-point operations (FLOPs).
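The combination of a low-rank decomposition with input-dependent masking can be sketched as follows: a linear layer W is replaced by factors A and B, the intermediate vector B x is computed, and only its largest-magnitude components are passed through A. This is a rough sketch under stated assumptions, not the RaNA adapter itself: the top-k masking rule, the function name `rank_adapter`, and the toy matrices are all hypothetical.

```python
def matvec(m, x):
    """Multiply a matrix (nested lists) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

def rank_adapter(a, b, x, keep):
    """Approximate (A B) x using only the `keep` strongest rank components.

    a: d_out x r matrix, b: r x d_in matrix, x: input of length d_in.
    The top-k-by-magnitude mask is an assumption made for this sketch.
    """
    z = matvec(b, x)  # r-dimensional intermediate activations
    order = sorted(range(len(z)), key=lambda i: abs(z[i]), reverse=True)
    active = sorted(order[:keep])
    # Only the columns of A for active components are touched, saving work.
    out = [sum(a[i][j] * z[j] for j in active) for i in range(len(a))]
    return out, active

a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 x 2 factor
b = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]    # 2 x 3 factor
x = [2.0, 0.1, 5.0]

full, active_full = rank_adapter(a, b, x, keep=2)      # no masking
pruned, active_pruned = rank_adapter(a, b, x, keep=1)  # drops the weak component
```

Because the mask is chosen per input from B x rather than from activation sparsity after a nonlinearity, the same pattern applies to any linear layer, which is in line with the abstract's point about covering attention projections as well as MLPs.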

large language model, machine learning, natural language, (20 more...)