AITopics | mspe

Collaborating Authors

mspe

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Integrative Learning of Dynamically Evolving Multiplex Graphs and Nodal Attributes Using Neural Network Gaussian Processes with an Application to Dynamic Terrorism Graphs

Rodriguez-Acosta, Jose, Guha, Sharmistha, Patel, Lekha, Shuler, Kurtis

arXiv.org Machine LearningMar-24-2026

Exploring the dynamic co-evolution of multiplex graphs and nodal attributes is a compelling question in criminal and terrorism networks. This article is motivated by the study of dynamically evolving interactions among prominent terrorist organizations, considering various organizational attributes like size, ideology, leadership, and operational capacity. Statistically principled integration of multiplex graphs with nodal attributes is significantly challenging due to the need to leverage shared information within and across layers, account for uncertainty in predicting unobserved links, and capture temporal evolution of node attributes. These difficulties increase when layers are partially observed, as in terrorism networks where connections are deliberately hidden to obscure key relationships. To address these challenges, we present a principled methodological framework to integrate the multiplex graph layers and nodal attributes. The approach employs time-varying stochastic latent factor models, leveraging shared latent factors to capture graph structure and its co-evolution with node attributes. Latent factors are modeled using Gaussian processes with an infinitely wide deep neural network-based covariance function, termed neural network Gaussian processes (NN-GP). The NN-GP framework on latent factors exploits the predictive power of Bayesian deep neural network architecture while propagating uncertainty for reliability. Simulation studies highlight superior performance of the proposed approach in achieving inferential objectives. The approach, termed as dynamic joint learner, enables predictive inference (with uncertainty) of diverse unobserved dynamic relationships among prominent terrorist organizations and their organization-specific attributes, as well as clustering behavior in terms of friend-and-foe relationships, which could be informative in counter-terrorism research.

artificial intelligence, machine learning, nodal, (19 more...)

arXiv.org Machine Learning

2603.20962

Country:

South America > Colombia (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Texas (0.04)
(13 more...)

Genre: Research Report > New Finding (0.67)

Industry: Law Enforcement & Public Safety > Terrorism (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Add feedback

MSPE: Multi-Scale Patch Embedding Prompts Vision Transformers to Any Resolution

Neural Information Processing SystemsFeb-10-2026, 22:45:49 GMT

The term resolution refers to the width and height of images input into neural networks.

machine learning, natural language, resolution, (16 more...)

Neural Information Processing Systems

Country: Asia > China (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.93)

Add feedback

MSPE: Multi-Scale Patch Embedding Prompts Vision Transformers to Any Resolution

Neural Information Processing SystemsDec-24-2025, 21:27:54 GMT

artificial intelligence, proceedings, resolution, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Vision (1.00)

Add feedback

3396657fe1a3c9a43ac7cd809c51a41e-Paper-Conference.pdf

Neural Information Processing SystemsOct-9-2025, 22:53:54 GMT

computer vision, proceedings, resolution, (13 more...)

Neural Information Processing Systems

Country: Asia > China (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Add feedback

Supplementary Materials of "BAST: Bayesian Additive Regression Spanning Trees for Complex Constrained Domain "

Neural Information Processing SystemsAug-13-2025, 15:58:32 GMT

These appendices provide supplementary details and results of BAST. Appendix A contains additional details on Bayesian estimation and prediction. Prediction at u is then performed as stated in Section 3.2. The experiment setup is the same as in Section 4.1 Table S3 shows the performance of BAST and BART using the hyperparameters chosen by CV (referred to as BAST -cv and BART -cv, respectively). As a benchmark, the performance metrics for BAST and BART using the hyperparameters in Section 4.1 are also included (referred to as Standard errors are given in parentheses.

Add feedback

MSPE: Multi-Scale Patch Embedding Prompts Vision Transformers to Any Resolution

Neural Information Processing SystemsMay-26-2025, 20:53:18 GMT

Although Vision Transformers (ViTs) have recently advanced computer vision tasks significantly, an important real-world problem was overlooked: adapting to variable input resolutions. Typically, images are resized to a fixed resolution, such as 224x224, for efficiency during training and inference. However, uniform input size conflicts with real-world scenarios where images naturally vary in resolution. In this work, we propose to enhance the model adaptability to resolution variation by optimizing the patch embedding. The proposed method, called Multi-Scale Patch Embedding (MSPE), substitutes the standard patch embedding with multiple variable-sized patch kernels and selects the best parameters for different resolutions, eliminating the need to resize the original image. Our method does not require high-cost training or modifications to other parts, making it easy to apply to most ViT models.

artificial intelligence, patch embedding prompt vision transformer, resolution, (1 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Vision (1.00)

Add feedback

Improving the Convergence Rates of Forward Gradient Descent with Repeated Sampling

Dexheimer, Niklas, Schmidt-Hieber, Johannes

arXiv.org Artificial IntelligenceNov-26-2024

Forward gradient descent (FGD) has been proposed as a biologically more plausible alternative of gradient descent as it can be computed without backward pass. Considering the linear model with $d$ parameters, previous work has found that the prediction error of FGD is, however, by a factor $d$ slower than the prediction error of stochastic gradient descent (SGD). In this paper we show that by computing $\ell$ FGD steps based on each training sample, this suboptimality factor becomes $d/(\ell \wedge d)$ and thus the suboptimality of the rate disappears if $\ell \gtrsim d.$ We also show that FGD with repeated sampling can adapt to low-dimensional structure in the input distribution. The main mathematical challenge lies in controlling the dependencies arising from the repeated sampling process.

fgd, gradient descent, mspe, (14 more...)

arXiv.org Artificial Intelligence

2411.17567

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > District of Columbia > Washington (0.04)
North America > Canada > Quebec > Montreal (0.04)
(5 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Invariant Subspace Decomposition

Lazzaretto, Margherita, Peters, Jonas, Pfister, Niklas

arXiv.org Machine LearningApr-15-2024

We consider the task of predicting a response Y from a set of covariates X in settings where the conditional distribution of Y given X changes over time. For this to be feasible, assumptions on how the conditional distribution changes over time are required. Existing approaches assume, for example, that changes occur smoothly over time so that short-term prediction using only the recent past becomes feasible. In this work, we propose a novel invariance-based framework for linear conditionals, called Invariant Subspace Decomposition (ISD), that splits the conditional distribution into a time-invariant and a residual time-dependent component. As we show, this decomposition can be utilized both for zero-shot and time-adaptation prediction tasks, that is, settings where either no or a small amount of training data is available at the time points we want to predict Y at, respectively. We propose a practical estimation procedure, which automatically infers the decomposition using tools from approximate joint matrix diagonalization. Furthermore, we provide finite sample guarantees for the proposed estimator and demonstrate empirically that it indeed improves on approaches that do not use the additional invariant structure.

inv, res, var, (15 more...)

arXiv.org Machine Learning

2404.09962

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
Asia > Japan (0.04)

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Supplementary Materials of "BAST: Bayesian Additive Regression Spanning Trees for Complex Constrained Domain "

Neural Information Processing SystemsMar-3-2024, 06:00:40 GMT

These appendices provide supplementary details and results of BAST. Appendix A contains additional details on Bayesian estimation and prediction. Supplementary simulation details and results including hyperparameter tuning and computation time can be found in Appendix B. Finally, Appendix C provides the proof of Proposition 1. Appendix A.1 Estimation This appendix provides details on the Markov chain Monte Carlo (MCMC) algorithm discussed in Section 3.1. This probability specification works well in our experiments, but one can modify it if desired. Appendix A.2 Prediction in Two-dimensional Constrained Domains In this subsection we provide details on specifying the neighbor set N To sample the cluster membership of u, we need to determine the cluster memberships for vertices on the domain boundary, which can be done by, for instance, assigning a boundary vertex to the same cluster as its nearest vertex in S with respect to the graph distance in the CDT mesh (when the number of vertices in the CDT graph is large, we expect this to well approximate the geodesic distance).

bart, bast, hyperparameter, (13 more...)

Neural Information Processing Systems

Country: North America > United States > Texas > Brazos County > College Station (0.05)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.41)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Sparse high-dimensional linear mixed modeling with a partitioned empirical Bayes ECM algorithm

Zgodic, Anja, Bai, Ray, Zhang, Jiajia, McLain, Alexander C.

arXiv.org Machine LearningOct-18-2023

While high-dimensional data has been ubiquitous for some time, the use of longitudinal high-dimensional data or grouped (clustered) high-dimensional data has been recently increasing in research. For example, some genetic studies gather gene expression levels for an individual on multiple occasions in response to an exposure over time (Banchereau et al., 2016). Other ongoing studies - like the UK Biobank and the Adolescent Brain Cognitive Development Study - collect high-dimensional genetic/imaging information longitudinally to learn how individual changes in these markers are related to outcomes (Cole, 2020; Saragosa-Harris et al., 2022). Such data usually violates the traditional linear regression assumption that observations are independently and identically distributed. Data analysis should account for the dependence between observations belonging to the same individual. For the low dimensional setting where n p, extensive methodology is available for handling such data structures, e.g., linear mixed models (LMMs). The fields of LMMs and high-dimensional linear regression have extensive bodies of literature. However, they are largely separate, with a very narrow body of literature existing at the intersection of LMMs and high-dimensional longitudinal data (where p n). Unlike low-dimensional (p n) LMMs for which restricted maximum likelihood (REML) methods are readily available, fitting high-dimensional LMMs is considerably more challenging due to the non-convexity of the optimization function, which requires the inversion of large matrices in addition to iterative approaches. The few available methods for highdimensional LMMs rely on sparsity-inducing penalizations (e.g.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

2310.12285

Country: