BIC
Evidence Slopes and Effective Dimension in Singular Linear Models
Bayesian model selection commonly relies on the Laplace approximation or the Bayesian Information Criterion (BIC), which assume that the effective model dimension equals the number of parameters. Singular learning theory replaces this assumption with the real log canonical threshold (RLCT), an effective dimension that can be strictly smaller in overparameterized or rank-deficient models. We study linear-Gaussian rank models and linear subspace (dictionary) models in which the exact marginal likelihood is available in closed form and the RLCT is analytically tractable. In this setting, we show theoretically and empirically that the error of Laplace/BIC grows as $(d/2 - \lambda)\log n$, where $d$ is the ambient parameter dimension and $\lambda$ is the RLCT. An RLCT-aware correction recovers the correct evidence slope and is invariant to overcomplete reparameterizations that represent the same data subspace. Our results provide a concrete finite-sample characterization of Laplace failure in singular models and demonstrate that evidence slopes can be used as a practical estimator of effective dimension in simple linear settings.
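To make the evidence-slope claim concrete, here is a minimal sketch (not the paper's code; sample sizes, variances, and variable names are assumptions) for a Bayesian linear model with duplicated columns, where the exact evidence is available in closed form: the fitted slope of $\ell_{\max} - \log Z_n$ against $\log n$ lands near the RLCT $\lambda = r/2$ rather than the BIC prediction $d/2$.

```python
import numpy as np

rng = np.random.default_rng(0)
r, sigma2, alpha2 = 3, 0.25, 1.0   # true rank, noise and prior variances (assumed)

def log_evidence(X, y, sigma2, alpha2):
    """Exact log marginal likelihood of y ~ N(0, sigma2*I_n + alpha2*X X^T),
    evaluated in d x d form via the Woodbury / matrix-determinant identities."""
    n, d = X.shape
    A = np.eye(d) / alpha2 + X.T @ X / sigma2          # posterior precision
    _, logdet_A = np.linalg.slogdet(A)
    logdet_cov = n * np.log(sigma2) + d * np.log(alpha2) + logdet_A
    b = X.T @ y / sigma2
    quad = y @ y / sigma2 - b @ np.linalg.solve(A, b)
    return -0.5 * (n * np.log(2 * np.pi) + logdet_cov + quad)

def max_loglik(X, y, sigma2):
    """Maximized Gaussian log-likelihood at the least-squares solution."""
    w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ w_hat) ** 2)
    return -0.5 * (len(y) * np.log(2 * np.pi * sigma2) + rss / sigma2)

# Overcomplete design: each true feature appears twice, so d = 2r parameters
# describe an r-dimensional data subspace, giving RLCT lambda = r/2.
ns, gaps = [10**2, 10**3, 10**4, 10**5, 10**6], []
for n in ns:
    Z = rng.standard_normal((n, r))
    X = np.hstack([Z, Z])
    y = Z @ rng.standard_normal(r) + np.sqrt(sigma2) * rng.standard_normal(n)
    gaps.append(max_loglik(X, y, sigma2) - log_evidence(X, y, sigma2, alpha2))

slope = np.polyfit(np.log(ns), gaps, 1)[0]
print(f"fitted evidence slope {slope:.2f}; RLCT r/2 = {r/2}, BIC predicts d/2 = {r}")
```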
Order Selection in Vector Autoregression by Mean Square Information Criterion
Hellstern, Michael, Shojaie, Ali
Vector autoregressive (VAR) processes are ubiquitously used in economics, finance, and biology. Order selection is an essential step in fitting VAR models. While many order selection methods exist, all come with weaknesses. Minimizing AIC is a popular approach but is known to consistently overestimate the true order for processes of small dimension, while methods based on the BIC or Hannan-Quinn (HQ) criteria have been shown to require large sample sizes to accurately estimate the order of larger-dimensional processes. We propose the mean square information criterion (MIC), based on the observation that the expected squared error loss is flat once the fitted order reaches or exceeds the true order. MIC is shown to consistently estimate the order of the process under relatively mild conditions. Our simulation results show that MIC outperforms AIC, BIC, and HQ under misspecification. This advantage is corroborated when forecasting COVID-19 outcomes in New York City. Order selection by MIC is implemented in the micvar R package available on CRAN.
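The MIC itself is implemented in the authors' micvar R package and is not reproduced here; the sketch below only sets up the classical baselines it is compared against, selecting a VAR order by AIC/BIC/HQ with statsmodels on a simulated VAR(2) (coefficients and sample size are illustrative assumptions).

```python
import numpy as np
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(1)
A1 = np.array([[0.5, 0.1], [0.0, 0.4]])    # illustrative stable VAR(2) coefficients
A2 = np.array([[0.2, 0.0], [0.1, 0.2]])
n, burn = 500, 100
y = np.zeros((n + burn, 2))
for t in range(2, n + burn):
    y[t] = A1 @ y[t - 1] + A2 @ y[t - 2] + rng.standard_normal(2)
y = y[burn:]                               # drop burn-in so the series is stationary

sel = VAR(y).select_order(maxlags=10)
print(sel.summary())
# AIC often selects a higher order than BIC/HQ on small-dimensional processes
print({"aic": sel.aic, "bic": sel.bic, "hqic": sel.hqic})
```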
What is in the model? A comparison of variable selection criteria and model search approaches
Xu, Shuangshuang, Ferreira, Marco A. R., Tegge, Allison N.
For many scientific questions, understanding the underlying mechanism is the goal. To help investigators better understand the underlying mechanism, variable selection is a crucial step that permits the identification of the regression variables most associated with the outcome of interest. A variable selection method consists of model evaluation using an information criterion and a search of the model space. Here, we provide a comprehensive comparison of variable selection methods using the performance measures of correct identification rate (CIR), recall, and false discovery rate (FDR). We consider BIC and AIC for evaluating models, and exhaustive, greedy, LASSO path, and stochastic search approaches for searching the model space; we also consider LASSO with cross-validation. We perform simulation studies for linear and generalized linear models that parametrically explore a wide range of realistic sample sizes, effect sizes, and correlations among regression variables. We consider model spaces with both small and large numbers of potential regressors. The results show that exhaustive search with BIC and stochastic search with BIC outperform the other methods on small and large model spaces, respectively. These approaches yield the highest CIR and lowest FDR, which collectively may support long-term efforts towards increasing replicability in research.
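As a concrete illustration of the best performer on small model spaces, here is a minimal sketch of exhaustive subset search scored by BIC (toy data; variable names and effect sizes are assumptions, not the authors' setup).

```python
from itertools import combinations
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n, p = 200, 6
X = rng.standard_normal((n, p))
y = 1.5 * X[:, 0] - 2.0 * X[:, 2] + rng.standard_normal(n)   # true model {0, 2}

best_bic, best_subset = np.inf, ()
for k in range(p + 1):                       # enumerate all 2^p candidate models
    for subset in combinations(range(p), k):
        design = sm.add_constant(X[:, list(subset)]) if subset else np.ones((n, 1))
        fit = sm.OLS(y, design).fit()
        if fit.bic < best_bic:
            best_bic, best_subset = fit.bic, subset
print("BIC-selected variables:", best_subset)   # expect (0, 2)
```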
Learning Bayesian Networks with Thousands of Variables
Mauro Scanagatta, Cassio P. de Campos, Giorgio Corani, Marco Zaffalon
We present a method for learning Bayesian networks from data sets containing thousands of variables without the need for structure constraints. Our approach consists of two parts. The first is a novel algorithm that effectively explores the space of possible parent sets of a node. It guides the exploration towards the most promising parent sets on the basis of an approximated score function that is computed in constant time. The second part is an improvement of an existing ordering-based algorithm for structure optimization. The new algorithm provably achieves a higher score compared to its original formulation. Our novel approach consistently outperforms the state of the art on very large data sets.
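The paper's novelty is an approximate score that ranks parent sets in constant time; that score is not reproduced here. The sketch below shows only the standard BIC local score such searches optimize, on toy binary data (the function and data are illustrative assumptions).

```python
import numpy as np

def bic_local_score(data, child, parents, arities):
    """Standard BIC local score of one discrete node given a candidate parent
    set, for integer-coded data: max log-likelihood minus (log n / 2) * #params."""
    n = data.shape[0]
    r = arities[child]
    q = int(np.prod([arities[j] for j in parents])) if parents else 1
    if parents:
        keys = np.ravel_multi_index(data[:, parents].T, [arities[j] for j in parents])
    else:
        keys = np.zeros(n, dtype=int)
    ll = 0.0
    for key in np.unique(keys):              # one multinomial per parent configuration
        counts = np.bincount(data[keys == key, child], minlength=r)
        nz = counts > 0
        ll += np.sum(counts[nz] * np.log(counts[nz] / counts.sum()))
    return ll - 0.5 * np.log(n) * q * (r - 1)

# toy check: X0 -> X1 with 10% flip noise, X2 independent; parent set {0} for
# node 1 should score best
rng = np.random.default_rng(3)
x0 = rng.integers(0, 2, 5000)
x1 = (x0 ^ (rng.random(5000) < 0.1)).astype(int)
x2 = rng.integers(0, 2, 5000)
data, arities = np.column_stack([x0, x1, x2]), [2, 2, 2]
for ps in ([], [0], [2], [0, 2]):
    print(ps, round(bic_local_score(data, 1, ps, arities), 1))
```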
DRBM-ClustNet: A Deep Restricted Boltzmann-Kohonen Architecture for Data Clustering
J. Senthilnath, Nagaraj G, Sumanth Simha C, Sushant Kulkarni, Meenakumari Thapa, Indiramma M, Jón Atli Benediktsson
A Bayesian Deep Restricted Boltzmann-Kohonen architecture for data clustering, termed DRBM-ClustNet, is proposed. The core clustering engine consists of a Deep Restricted Boltzmann Machine (DRBM) that processes unlabeled data by creating new features that are mutually uncorrelated and have large variance. Next, the number of clusters is predicted using the Bayesian Information Criterion (BIC), followed by a Kohonen network-based clustering layer. The unlabeled data is processed in three stages for efficient clustering of non-linearly separable datasets. In the first stage, the DRBM performs non-linear feature extraction, capturing a highly complex data representation by projecting feature vectors of $d$ dimensions into $n$ dimensions. Since most clustering algorithms require the number of clusters to be decided a priori, the second stage uses BIC to automate this choice. In the third stage, the number of clusters derived from BIC forms the input to the Kohonen network, which clusters the feature-extracted data obtained from the DRBM. This method overcomes general disadvantages of clustering algorithms such as the prior specification of the number of clusters, convergence to local optima, and poor clustering accuracy on non-linear datasets. We use two synthetic datasets, fifteen benchmark datasets from the UCI Machine Learning repository, and four image datasets to analyze DRBM-ClustNet. The proposed framework is evaluated on clustering accuracy and ranked against other state-of-the-art clustering methods. The results demonstrate that DRBM-ClustNet outperforms state-of-the-art clustering algorithms.
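The second stage, choosing the number of clusters by BIC, can be illustrated with scikit-learn's Gaussian mixture BIC on toy blob data; this is a generic stand-in, not BIC applied in the paper's DRBM feature space.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# toy features with 4 well-separated clusters (illustrative stand-in for the
# DRBM-extracted features)
X, _ = make_blobs(n_samples=600, centers=4, cluster_std=1.0, random_state=0)

bics = []
for k in range(1, 9):
    gmm = GaussianMixture(n_components=k, n_init=5, random_state=0).fit(X)
    bics.append(gmm.bic(X))                 # lower BIC = better fit/complexity trade-off
k_best = int(np.argmin(bics)) + 1
print("BIC-selected number of clusters:", k_best)   # expect 4
```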
A system identification approach to clustering vector autoregressive time series
Yue, Zuogong, Wang, Xinyi, Solo, Victor
Clustering time series by their underlying dynamics continues to attract researchers because of its impact on complex system modelling. Most current time series clustering methods handle only scalar time series, treat them as white noise, or rely on domain knowledge for high-quality feature construction, largely ignoring the autocorrelation structure. Instead of relying on heuristic feature or metric construction, the system identification approach treats vector time series clustering by explicitly modelling the underlying autoregressive dynamics. We first derive a clustering algorithm based on a mixture autoregressive model. Unfortunately, it turns out to have significant computational problems. We then derive a 'small-noise' limiting version of the algorithm, which we call k-LMVAR (Limiting Mixture Vector AutoRegression), that is computationally manageable. We develop an associated BIC criterion for choosing the number of clusters and the model order. The algorithm performs very well in comparative simulations and also scales well computationally.
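k-LMVAR itself is more than a short snippet; as a much simplified stand-in for the same joint selection problem, the sketch below summarizes each vector series by its least-squares VAR(p) coefficients and uses a Gaussian mixture BIC to choose the number of clusters K together with the order p (all names and values are assumptions, not the authors' criterion).

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(4)

def simulate_var1(A, n=300):
    """Simulate a stable 1st-order VAR with coefficient matrix A."""
    y = np.zeros((n, A.shape[0]))
    for t in range(1, n):
        y[t] = A @ y[t - 1] + 0.5 * rng.standard_normal(A.shape[0])
    return y

def var_coeffs(y, p):
    """Stack lagged values and solve least squares for the VAR(p) coefficients."""
    n, m = y.shape
    X = np.hstack([y[p - k - 1:n - k - 1] for k in range(p)])
    B, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return B.ravel()

# two ground-truth dynamic regimes, 20 series each
A_slow, A_fast = 0.2 * np.eye(2), np.array([[0.8, -0.3], [0.3, 0.8]])
series = [simulate_var1(A_slow) for _ in range(20)] + \
         [simulate_var1(A_fast) for _ in range(20)]

best = None
for p in (1, 2, 3):
    feats = np.array([var_coeffs(y, p) for y in series])
    for K in (1, 2, 3, 4):
        bic = GaussianMixture(n_components=K, n_init=5, random_state=0).fit(feats).bic(feats)
        if best is None or bic < best[0]:
            best = (bic, K, p)
print("selected (K, p):", best[1:])   # expect K = 2
```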
A flexible Bayesian non-parametric mixture model reveals multiple dependencies of swap errors in visual working memory
Radmard, Puria, Bays, Paul M., Lengyel, Máté
Human behavioural data in psychophysics have been used to elucidate the underlying mechanisms of many cognitive processes, such as attention, sensorimotor integration, and perceptual decision making. Visual working memory (VWM) has particularly benefited from this approach: analyses of VWM errors have proven crucial for understanding VWM capacity and coding schemes, in turn constraining neural models of both. One poorly understood class of VWM errors is swap errors, whereby participants recall an uncued item from memory. Swap errors could arise from erroneous memory encoding, noisy storage, or errors at retrieval time; previous research has mostly implicated the latter two. However, these studies made strong a priori assumptions about the detailed mechanisms and/or parametric form of the errors contributed by these sources. Here, we pursue a data-driven approach instead, introducing a Bayesian non-parametric mixture model of swap errors (BNS) which provides a flexible descriptive model of swapping behaviour, such that swaps are allowed to depend on both the probed and reported features of every stimulus item. We fit BNS to the trial-by-trial behaviour of human participants and show that it recapitulates the strong dependence of swaps on cue similarity in multiple datasets. Critically, BNS reveals that this dependence coexists with a non-monotonic modulation in the report feature dimension for a dataset cued by random dot motion direction and reported by location. The form of the modulation inferred by BNS opens new questions about the importance of memory encoding in causing swap errors in VWM, a source distinct from the previously suggested binding and cueing errors. Our analyses, combining qualitative comparisons of the highly interpretable BNS parameter structure with rigorous quantitative model comparison and recovery methods, show that previous interpretations of swap errors may have been incomplete.
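BNS is non-parametric and not reproduced here; as background, the sketch below evaluates the classical parametric swap mixture it generalizes (target, non-target, and uniform-guess components over circular errors) on toy data, with all parameter values assumed.

```python
import numpy as np
from scipy.stats import vonmises

def swap_mixture_loglik(err_target, err_nontarget, p_target, p_swap, kappa):
    """Log-likelihood of circular response errors under a target/swap/guess
    mixture. err_target: (n,) response minus target angle; err_nontarget:
    (n, m) response minus each non-target angle; all in radians."""
    p_guess = 1.0 - p_target - p_swap
    lik = (p_target * vonmises.pdf(err_target, kappa)
           + p_swap * vonmises.pdf(err_nontarget, kappa).mean(axis=1)
           + p_guess / (2 * np.pi))
    return float(np.log(lik).sum())

# toy data: 80% target responses, 15% swaps to a random non-target, 5% guesses
rng = np.random.default_rng(5)
n, kappa_true = 2000, 8.0
comp = rng.choice(3, size=n, p=[0.80, 0.15, 0.05])
targets = rng.uniform(-np.pi, np.pi, n)
nontargets = rng.uniform(-np.pi, np.pi, (n, 2))
swap_to = nontargets[np.arange(n), rng.integers(0, 2, n)]
noise = vonmises.rvs(kappa_true, size=n, random_state=0)
resp = np.where(comp == 0, targets + noise,
       np.where(comp == 1, swap_to + noise, rng.uniform(-np.pi, np.pi, n)))
wrap = lambda a: (a + np.pi) % (2 * np.pi) - np.pi      # wrap angles to (-pi, pi]
err_t = wrap(resp - targets)
err_nt = wrap(resp[:, None] - nontargets)
print(swap_mixture_loglik(err_t, err_nt, 0.80, 0.15, kappa_true))   # true parameters
print(swap_mixture_loglik(err_t, err_nt, 0.95, 0.00, kappa_true))   # no-swap fit: worse
```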
Reviews, Discussions, Author Feedback and Meta-Reviews
The paper describes tricks to scale Bayesian network structure learning to thousands of variables. This is achieved by developing new heuristics for candidate parent set identification and for the subsequent order-based structure optimization. In general, the paper is clearly written and easy to read. There are issues in editing and style, but the problems do not affect readability (much). The suggested heuristics feel a bit ad hoc, so the value of the work is ultimately judged by the empirical evaluation.