AITopics | linear layer

TensorNet: Cartesian Tensor Representations for Efficient Learning of Molecular Potentials

Neural Information Processing Systems | Apr-28-2026, 15:37:46 GMT

The development of efficient machine learning models for molecular systems representation is becoming crucial in scientific research. We introduce TensorNet, an innovative O(3)-equivariant message-passing neural network architecture that leverages Cartesian tensor representations. By using Cartesian tensor atomic embeddings, feature mixing is simplified through matrix product operations. Furthermore, the cost-effective decomposition of these tensors into rotation group irreducible representations allows for the separate processing of scalars, vectors, and tensors when necessary. Compared to higher-rank spherical tensor models, TensorNet demonstrates state-of-the-art performance with significantly fewer parameters. For small molecule potential energies, this can be achieved even with a single interaction layer. As a result of all these properties, the model's computational cost is substantially decreased. Moreover, the accurate prediction of vector and tensor molecular quantities on top of potential energies and forces is possible. In summary, TensorNet's framework opens up a new space for the design of state-of-the-art equivariant models.
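The decomposition into irreducible representations mentioned in the abstract can be illustrated on a single rank-2 Cartesian tensor. The sketch below is a minimal NumPy illustration and not TensorNet's implementation. It assumes a 3x3 tensor X and splits it into an isotropic part (scalar), an antisymmetric part (vector) and a symmetric traceless part (tensor). The function name `decompose_rank2` is hypothetical.

```python
import numpy as np

def decompose_rank2(X):
    """Split a 3x3 Cartesian tensor into scalar, vector and tensor parts.

    Hypothetical helper for illustration only (not from the paper's code).
    Returns (I, A, S) with X == I + A + S.
    """
    X = np.asarray(X, dtype=float)
    # Isotropic part: carries the trace (1 degree of freedom, scalar).
    I = np.trace(X) / 3.0 * np.eye(3)
    # Antisymmetric part: 3 degrees of freedom (vector).
    A = 0.5 * (X - X.T)
    # Symmetric traceless part: 5 degrees of freedom (rank-2 tensor).
    S = 0.5 * (X + X.T) - I
    return I, A, S

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((3, 3))
    I, A, S = decompose_rank2(X)
    print(np.allclose(I + A + S, X))
```

Each part keeps its form under an orthogonal transformation X -> R X R^T, which is why the three parts can be processed separately without mixing.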

artificial intelligence, machine learning, representation, (16 more...)

Genre: Research Report (0.47)

Industry:

Materials > Chemicals > Commodity Chemicals > Petrochemicals (0.47)
Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Stable and low-precision training for large-scale vision-language models

Neural Information Processing Systems | Apr-25-2026, 19:27:40 GMT

We introduce new methods for 1) accelerating and 2) stabilizing training for large language-vision models.

large language model, machine learning, spike, (19 more...)

Country: North America > Canada (0.46)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

20bd42d82998bc61732c00452228e814-Paper-Conference.pdf

Neural Information Processing Systems | Apr-25-2026, 19:27:37 GMT

large language model, loss spike, machine learning, (20 more...)

Country: North America > Canada (0.46)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

4588e674d3f0faf985047d4c3f13ed0d-Supplemental.pdf

Neural Information Processing Systems | Apr-25-2026, 16:05:31 GMT

artificial intelligence, latexit sha1, machine learning, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.52)

Overcoming the Convex Barrier for Simplex Inputs: Supplementary Material

Neural Information Processing Systems | Apr-25-2026, 04:20:52 GMT

Strong mixed-integer programming formulations for trained neural networks.

artificial intelligence, machine learning, relaxation, (16 more...)

Country: Europe > United Kingdom > England (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

6 Supplementary Material 6.1 Network Architecture

Neural Information Processing Systems | Apr-24-2026, 18:49:23 GMT

This section details the CipherNav network architecture in Tables 4, 5, and 6. The view encoder E is shown in Table 4 and the map encoder E is shown in Table 5. The encoders are trained end-to-end during plaintext training and frozen during ciphertext training. Each party holds a copy of the encoder models and computes all forward passes locally during ciphertext training. The action classification network G is shown in Table 6.

artificial intelligence, machine learning, obstacle, (15 more...)

Industry: Information Technology > Security & Privacy (0.51)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.71)
Information Technology > Communications > Networks (0.61)

Circa: Stochastic ReLUs for Private Deep Learning

Neural Information Processing Systems | Apr-24-2026, 18:30:08 GMT

The simultaneous rise of machine learning as a service and concerns over user privacy have increasingly motivated the need for private inference (PI). While recent work demonstrates PI is possible using cryptographic primitives, the computational overheads render it impractical. State-of-the-art deep networks are inadequate in this context because the source of slowdown in PI stems from the ReLU operations, whereas optimizations for plaintext inference focus on reducing FLOPs. In this paper we re-think ReLU computations and propose optimizations for PI tailored to properties of neural networks. Specifically, we reformulate ReLU as an approximate sign test and introduce a novel truncation method for the sign test that significantly reduces the cost per ReLU. These optimizations result in a specific type of stochastic ReLU. The key observation is that the stochastic fault behavior is well suited for the fault-tolerant properties of neural network inference. Thus, we provide significant savings without impacting accuracy. We collectively call the optimizations Circa and demonstrate improvements of up to 4.7× storage and 3× runtime over baseline implementations; we further show that Circa can be used on top of recent PI optimizations to obtain 1.8× additional speedup.
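The idea of a ReLU built on a truncated sign test can be illustrated with a toy, non-cryptographic simulation. The sketch below is not the Circa protocol: it assumes plain integer additive shares (no modular ring, no garbled circuits), and the names `share`, `truncated_sign_is_nonneg` and `stochastic_relu` are hypothetical. It only shows why dropping the k low bits of each share before the sign test can produce faults solely for small non-negative inputs.

```python
import random

def share(x, rng, bound=1 << 20):
    """Split integer x into two additive shares with a + b == x (toy setting)."""
    a = rng.randrange(-bound, bound)
    return a, x - a

def truncated_sign_is_nonneg(a, b, k):
    """Approximate sign test: drop the k low bits of each share, then test the sum.

    Python's >> on ints is a floor shift, so the truncated sum is either
    floor(x / 2**k) or one less than it.
    """
    return (a >> k) + (b >> k) >= 0

def stochastic_relu(x, k, rng):
    """ReLU whose sign decision uses the truncated shares (illustrative only)."""
    a, b = share(x, rng)
    return x if truncated_sign_is_nonneg(a, b, k) else 0

if __name__ == "__main__":
    rng = random.Random(0)
    k = 4
    faults = sum(stochastic_relu(x, k, rng) != max(x, 0)
                 for x in range(-100, 100))
    print("faulty outputs:", faults)
```

In this toy setting a fault can occur only when 0 <= x < 2**k, so the output error is always smaller than 2**k. That bounded, small-magnitude error is the kind of fault the abstract argues neural network inference tolerates.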

artificial intelligence, circa, machine learning, (20 more...)

Country: North America > United States (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)

1165af8b913fb836c6280b42d6e0084f-Supplemental-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 15:49:23 GMT

artificial intelligence, experiment, machine learning, (15 more...)

Country: North America > United States (0.14)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.31)

0a4dc6dae338c9cb08947c07581f77a2-Paper.pdf

Neural Information Processing Systems | Apr-24-2026, 15:16:13 GMT

logic & formal reasoning, machine learning, natural language, (22 more...)

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.68)

Low-Rank Compression of Pretrained Models via Randomized Subspace Iteration

Pourkamali-Anaraki, Farhad

arXiv.org Machine Learning | Apr-6-2026

The massive scale of pretrained models has made efficient compression essential for practical deployment. Low-rank decomposition based on the singular value decomposition (SVD) provides a principled approach for model reduction, but its exact computation is expensive for large weight matrices. Randomized alternatives such as randomized SVD (RSVD) improve efficiency, yet they can suffer from poor approximation quality when the singular value spectrum decays slowly, a regime commonly observed in modern pretrained models. In this work, we address this limitation from both theoretical and empirical perspectives. First, we establish a connection between low-rank approximation error and predictive performance by analyzing softmax perturbations, showing that deviations in class probabilities are controlled by the spectral error of the compressed weights. Second, we demonstrate that RSVD is inadequate, and we propose randomized subspace iteration (RSI) as a more effective alternative. By incorporating multiple power iterations, RSI improves spectral separation and provides a controllable mechanism for enhancing approximation quality. We evaluate our approach on both convolutional networks and transformer-based architectures. Our results show that RSI achieves near-optimal approximation quality while outperforming RSVD in predictive accuracy under aggressive compression, enabling efficient model compression.
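Randomized subspace iteration as described in the abstract can be sketched in a few lines of NumPy. This is a generic textbook-style version and not the author's code; the function name, the `oversample` parameter and the default values are assumptions made for illustration. Setting `n_iter=0` reduces it to a basic randomized SVD, and each extra iteration applies W and its transpose once more, with re-orthonormalization for numerical stability.

```python
import numpy as np

def randomized_subspace_iteration(W, rank, n_iter=2, oversample=5, seed=0):
    """Approximate rank-`rank` SVD factors of W (hypothetical helper).

    Returns U (m x rank), s (rank,), Vt (rank x n) with W ~= (U * s) @ Vt.
    """
    rng = np.random.default_rng(seed)
    m, n = W.shape
    # Random test matrix; a few extra columns improve the captured subspace.
    omega = rng.standard_normal((n, rank + oversample))
    Q, _ = np.linalg.qr(W @ omega)
    # Power iterations: sharpen the separation between singular values.
    for _ in range(n_iter):
        Q, _ = np.linalg.qr(W.T @ Q)
        Q, _ = np.linalg.qr(W @ Q)
    # Project W onto the subspace and take an exact SVD of the small matrix.
    B = Q.T @ W
    U_b, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_b
    return U[:, :rank], s[:rank], Vt[:rank]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    W = rng.standard_normal((60, 8)) @ rng.standard_normal((8, 40))
    U, s, Vt = randomized_subspace_iteration(W, rank=8)
    print(np.linalg.norm(W - (U * s) @ Vt, 2))
```

Storing `U * s` and `Vt` in place of W replaces one m x n weight matrix with two factors of total size rank * (m + n), which is where the compression comes from.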

approximation, machine learning, natural language, (18 more...)

2604.02659

Country: