AITopics

2506.05567

Genre: Research Report > New Finding (0.48)

Industry: Energy > Power Industry (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Biloš, Marin, Schneider, Anderson, Nevmyvaka, Yuriy

Speculative Sampling for Parametric Temporal Point Processes

arXiv.org Artificial Intelligence Oct-24-2025

Temporal point processes are powerful generative models for event sequences that capture complex dependencies in time-series data. They are commonly specified using autoregressive models that learn the distribution of the next event from the previous events. This makes sampling inherently sequential, limiting efficiency. In this paper, we propose a novel algorithm based on rejection sampling that enables exact sampling of multiple future values from existing TPP models, in parallel, and without requiring any architectural changes or retraining. Besides theoretical guarantees, our method demonstrates empirical speedups on real-world datasets, bridging the gap between expressive modeling and efficient parallel generation for large-scale TPP applications.
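The abstract names rejection sampling as the building block. As a minimal sketch of plain rejection sampling only (not the paper's speculative algorithm), the example below draws inter-event times from an Exp(2) target using an Exp(1) proposal. The function name, the two densities, and the bound M = 2 are illustrative assumptions, not taken from the paper.

```python
import math
import random

def sample_exp2_by_rejection(rng):
    """Draw one sample from the target density p(t) = 2*exp(-2t), t >= 0.

    Assumed setup (illustrative, not from the paper):
      proposal q(t) = exp(-t), and p(t)/q(t) = 2*exp(-t) <= M with M = 2,
      so the acceptance probability is p(t) / (M*q(t)) = exp(-t).
    """
    while True:
        t = rng.expovariate(1.0)          # draw a candidate from the proposal
        if rng.random() < math.exp(-t):   # accept with probability p/(M*q)
            return t

# Usage: the sample mean should be close to the Exp(2) mean of 0.5.
rng = random.Random(0)
samples = [sample_exp2_by_rejection(rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
```

Accepted draws are exact samples from the target, which is the property that a rejection-based scheme relies on; the expected number of proposals per accepted sample here is M = 2.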

artificial intelligence, machine learning, natural language, (19 more...)

2510.20031

Genre: Research Report (1.00)

Industry: Banking & Finance (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

arXiv.org Artificial Intelligence May-7-2025

A New Perspective To Understanding Multi-resolution Hash Encoding For Neural Fields

Luo, Steven Tin Sui

Instant-NGP has been the state-of-the-art architecture of neural fields in recent years. Its incredible signal-fitting capabilities are generally attributed to its multi-resolution hash grid structure, which has been used and improved in numerous subsequent works. However, it is unclear how and why such a hash grid structure improves the capabilities of a neural network by such great margins. A lack of principled understanding of the hash grid also implies that the large set of hyperparameters accompanying Instant-NGP can only be tuned empirically, with few guiding heuristics. To provide an intuitive explanation of the working principle of the hash grid, we propose a novel perspective, namely domain manipulation. This perspective provides a ground-up explanation of how the feature grid learns the target signal and increases the expressivity of the neural field by artificially creating multiples of pre-existing linear segments. We conducted numerous experiments on carefully constructed 1-dimensional signals to support our claims empirically and aid our illustrations. While our analysis mainly focuses on 1-dimensional signals, we show that the idea is generalizable to higher dimensions.
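To make the 1-dimensional setting concrete, here is a minimal sketch of a multi-resolution grid lookup with linear interpolation. The function name `encode_1d`, the parameters `n_min` and `growth`, and the use of simple modulo indexing in place of a hash function are assumptions made for illustration; this is not the Instant-NGP implementation.

```python
import math

def encode_1d(x, tables, n_min=4, growth=2.0):
    """Return one interpolated feature per level for x in [0, 1].

    tables[level] is a list of scalar features. Grid vertex k at a level is
    stored at index k % len(table), so vertices collide when the table is
    smaller than the grid (an assumed stand-in for hashing).
    """
    feats = []
    for level, table in enumerate(tables):
        res = int(math.floor(n_min * growth ** level))  # cells at this level
        pos = x * res
        i0 = min(int(math.floor(pos)), res - 1)         # left vertex of the cell
        w = pos - i0                                    # position inside the cell
        f0 = table[i0 % len(table)]
        f1 = table[(i0 + 1) % len(table)]
        feats.append((1.0 - w) * f0 + w * f1)           # linear interpolation
    return feats

# Usage: level 0 has 4 cells and 5 stored vertices (no collisions);
# level 1 has 8 cells but only 4 table entries (vertices collide).
tables = [[0.0, 1.0, 2.0, 3.0, 4.0], [10.0, 20.0, 30.0, 40.0]]
features = encode_1d(0.375, tables)
```

Within each cell the output of a level is linear in x, so each level contributes a piecewise linear function; the collisions at the finer level reuse the same stored values in several cells.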

artificial intelligence, expressivity, machine learning, (15 more...)

2505.03042

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
North America > United States (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Halfon, Matan, Cohen, Tomer, Fattal, Raanan, Schneidman-Duhovny, Dina

ContactNet: Geometric-Based Deep Learning Model for Predicting Protein-Protein Interactions

arXiv.org Artificial Intelligence Jun-26-2024

Deep learning approaches have achieved significant progress in predicting protein structures. These methods are often applied to protein-protein interactions (PPIs), yet they require Multiple Sequence Alignment (MSA), which is unavailable for various interactions, such as antibody-antigen. Computational docking methods are capable of sampling accurate complex models, but also produce thousands of invalid configurations. The design of scoring functions for identifying accurate models is a long-standing challenge. We develop a novel attention-based Graph Neural Network (GNN), ContactNet, for classifying PPI models obtained from docking algorithms into accurate and incorrect ones. When trained on docked antigen and modeled antibody structures, ContactNet doubles the accuracy of current state-of-the-art scoring functions, achieving accurate models among its Top-10 in 43% of the test cases. When applied to unbound antibodies, its Top-10 accuracy increases to 65%. This performance is achieved without MSA, and the approach is applicable to other types of interactions, such as host-pathogen or general PPIs.

bioinformatics, contactnet, prediction, (15 more...)

2406.18314

Country: Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.06)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence Nov-12-2023

On learning spatial sequences with the movement of attention

Osaulenko, Viacheslav M.

In this paper we start with a simple question: how is it possible that humans can recognize different movements over skin with only a prior visual experience of them? Or, in general, what is the representation of spatial sequences that are invariant to scale, rotation, and translation across different modalities? To answer, we rethink the mathematical representation of spatial sequences, argue against the minimum description length principle, and focus on the movements of attention. We advance the idea that spatial sequences must be represented on different levels of abstraction; this adds redundancy but is necessary for recognition and generalization. To address the open question of how these abstractions are formed, we propose two hypotheses: the first invites exploring selectionism learning instead of finding parameters in some models; the second proposes to find new data structures, not neural network architectures, to efficiently store and operate over redundant features to be further selected. Movements of attention are central to human cognition, and the lessons should be applied to new, better learning algorithms.

representation, sequence, spatial sequence, (17 more...)

2311.06856

Country:

Europe > Ukraine > Kyiv Oblast > Kyiv (0.04)
Europe > Croatia > Primorje-Gorski Kotar County > Rijeka (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.49)

arXiv.org Artificial Intelligence May-4-2023

Variations on a Theme by Blahut and Arimoto

Chen, Lingyi, Wu, Shitong, Ye, Wenhao, Wu, Huihui, Zhang, Wenyi, Wu, Hao, Bai, Bo

The Blahut-Arimoto (BA) algorithm has played a fundamental role in the numerical computation of rate-distortion (RD) functions. This algorithm possesses a desirable monotonic convergence property by alternately minimizing its Lagrangian with a fixed multiplier. In this paper, we propose a novel modification of the BA algorithm, letting the multiplier be updated in each iteration via a one-dimensional root-finding step with respect to a monotonic univariate function, which can be efficiently implemented by Newton's method. This allows the multiplier to be updated in a flexible and efficient manner, overcoming a major drawback of the original BA algorithm wherein the multiplier is fixed throughout iterations. Consequently, the modified algorithm is capable of directly computing the RD function for a given target distortion, without exploring the entire RD curve as in the original BA algorithm. A theoretical analysis shows that the modified algorithm still converges to the RD function and the convergence rate is $\Theta(1/n)$, where $n$ denotes the number of iterations. Numerical experiments demonstrate that the modified algorithm directly computes the RD function with a given target distortion, and it significantly accelerates the original BA algorithm.
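For orientation, the following is a minimal sketch of the classic BA iteration with a fixed multiplier, i.e. the baseline the abstract describes, not the modified algorithm with the root-finding step. The function name `blahut_arimoto`, the argument names, and the fixed iteration count are assumptions chosen for illustration.

```python
import math

def blahut_arimoto(p_x, dist, beta, iters=200):
    """Classic BA for rate-distortion with a fixed multiplier beta > 0.

    p_x[i]     : probability of source symbol i
    dist[i][j] : distortion between source symbol i and reproduction j
    Returns (rate_in_bits, expected_distortion) at the point of the RD
    curve selected by beta.
    """
    n, m = len(p_x), len(dist[0])
    q = [1.0 / m] * m                     # reproduction marginal, uniform start
    cond = []
    for _ in range(iters):
        # Step 1: best conditional distribution for the current marginal q.
        cond = []
        for i in range(n):
            w = [q[j] * math.exp(-beta * dist[i][j]) for j in range(m)]
            z = sum(w)
            cond.append([v / z for v in w])
        # Step 2: best marginal for the current conditional distribution.
        q = [sum(p_x[i] * cond[i][j] for i in range(n)) for j in range(m)]
    rate, distortion = 0.0, 0.0
    for i in range(n):
        for j in range(m):
            joint = p_x[i] * cond[i][j]
            distortion += joint * dist[i][j]
            if joint > 0.0:
                rate += joint * math.log2(cond[i][j] / q[j])
    return rate, distortion

# Usage: uniform binary source with Hamming distortion.
rate, distortion = blahut_arimoto([0.5, 0.5], [[0.0, 1.0], [1.0, 0.0]], beta=2.0)
```

Because beta is fixed, the distortion reached is an output of the run rather than an input; hitting a prescribed target distortion with this baseline requires repeating the run over several values of beta.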

algorithm, artificial intelligence, mathematics of computing, (18 more...)

2305.0265

Country:

Asia > Taiwan > Taiwan Province > Taipei (0.04)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(7 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Mathematics of Computing (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.47)

Molnar, Christoph, Casalicchio, Giuseppe, Bischl, Bernd

Quantifying Interpretability of Arbitrary Machine Learning Models Through Functional Decomposition

arXiv.org Machine Learning Apr-8-2019

To obtain interpretable machine learning models, either interpretable models are constructed from the outset - e.g. shallow decision trees, rule lists, or sparse generalized linear models - or post-hoc interpretation methods - e.g. partial dependence or ALE plots - are employed. Both approaches have disadvantages. While the former can restrict the hypothesis space too conservatively, leading to potentially suboptimal solutions, the latter can produce too verbose or misleading results if the resulting model is too complex, especially w.r.t. feature interactions. We propose to make the compromise between predictive power and interpretability explicit by quantifying the complexity / interpretability of machine learning models. Based on functional decomposition, we propose measures of number of features used, interaction strength and main effect complexity. We show that post-hoc interpretation of models that minimize the three measures becomes more reliable and compact. Furthermore, we demonstrate the application of such measures in a multi-objective optimization approach which considers predictive power and interpretability at the same time.

artificial intelligence, interpretability, machine learning, (17 more...)

1904.03867

Country:

Europe > Germany (0.28)
Europe > Austria (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)

Li, Yuanzhi, Singer, Yoram

The Well Tempered Lasso

arXiv.org Machine Learning Jun-8-2018

We study the complexity of the entire regularization path for least squares regression with 1-norm penalty, known as the Lasso. Every regression parameter in the Lasso changes linearly as a function of the regularization value. The number of changes is regarded as the Lasso's complexity. Experimental results using exact path following exhibit polynomial complexity of the Lasso in the problem size. Alas, the path complexity of the Lasso on artificially designed regression problems is exponential. We use smoothed analysis as a mechanism for bridging the gap between worst case settings and the de facto low complexity. Our analysis assumes that the observed data has a tiny amount of intrinsic noise. We then prove that the Lasso's complexity is polynomial in the problem size. While building upon the seminal work of Spielman and Teng on smoothed complexity, our analysis is morally different as it is divorced from specific path following algorithms. We verify the validity of our analysis in experiments with both worst case settings and real datasets. The empirical results we obtain closely match our analysis.
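The piecewise linear dependence on the regularization value can be seen in a simple special case. The sketch below assumes an orthonormal design (X^T X = I), where the Lasso solution for the objective (1/2)||y - Xw||^2 + lam*||w||_1 is coordinate-wise soft-thresholding of z = X^T y. The orthonormality assumption and all function names are illustrative and are not part of the paper's analysis.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_orthonormal(z, lam):
    """Lasso solution under the assumed orthonormal design, with z = X^T y."""
    return [soft_threshold(v, lam) for v in z]

def path_breakpoints(z):
    """Regularization values where a coefficient enters the model.

    Between consecutive breakpoints every coefficient is linear in lam,
    so the number of breakpoints counts the linear pieces of the path.
    """
    return sorted({abs(v) for v in z if v != 0.0}, reverse=True)

# Usage: three coefficients give three breakpoints in this special case.
z = [3.0, -1.0, 0.5]
solution = lasso_orthonormal(z, 1.0)
kinks = path_breakpoints(z)
```

In this special case the path has at most one breakpoint per coefficient; the exponential worst case mentioned in the abstract arises only for correlated designs, which this sketch does not cover.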

artificial intelligence, linear segment, machine learning, (19 more...)

1806.0319

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Bento, Jose, Ray, Surjyendu

On the Complexity of the Weighted Fused Lasso

arXiv.org Machine Learning Feb-1-2018

The solution path of the 1D fused lasso for an n-dimensional input is piecewise linear with O(n) segments [1], [2]. However, existing proofs of this bound do not hold for the weighted fused lasso. We also give a new, very simple, proof of the O(n) bound for the fused lasso. There are efficient algorithms to solve 1FL for a fixed γ. This algorithm has recently been improved to a version, [8], that finishes in O(n) steps. Iterative algorithms, mostly first-order fixed-point methods, include [11]-[19]. Some of these are based on the ADMM method, known to achieve the fastest possible convergence rate among all first-order methods, [20], [21]. However, in many applications, when precision is crucial, or when implementing a termination procedure has a non-negligible computational cost, direct algorithms are preferred.

algorithm, artificial intelligence, programming language, (17 more...)

1801.04987

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)
Information Technology > Software > Programming Languages (0.34)

Basri, Ronen, Jacobs, David

Efficient Representation of Low-Dimensional Manifolds using Deep Networks

arXiv.org Machine Learning Feb-15-2016

We consider the ability of deep neural networks to represent data that lies near a low-dimensional manifold in a high-dimensional space. We show that deep networks can efficiently extract the intrinsic, low-dimensional coordinates of such data. We first show that the first two layers of a deep network can exactly embed points lying on a monotonic chain, a special type of piecewise linear manifold, mapping them to a low-dimensional Euclidean space. Remarkably, the network can do this using an almost optimal number of parameters. We also show that this network projects nearby points onto the manifold and then embeds them with little error. We then extend these results to more general manifolds.

artificial intelligence, machine learning, manifold, (16 more...)

1602.04723

Country: North America > United States > Maryland (0.28)

Genre: Research Report (0.50)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)