AITopics | scalability

Collaborating Authors

scalability

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Necessary and Sufficient Conditions for Optimal Decision Trees using Dynamic Programming

Neural Information Processing SystemsApr-25-2026, 15:14:43 GMT

artificial intelligence, machine learning, optimization task, (17 more...)

Neural Information Processing Systems

Country: North America > United States > California (0.27)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.67)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(3 more...)

Add feedback

Scaling Gaussian Processes with Derivative Information Using Variational Inference

Neural Information Processing SystemsApr-25-2026, 09:57:07 GMT

Gaussian processes with derivative information are useful in many settings where derivative information is available, including numerous Bayesian optimization and regression tasks that arise in the natural sciences. Incorporating derivative observations, however, comes with a dominating O(N3D3) computational cost when training on N points in D input dimensions. This is intractable for even moderately sized problems. While recent work has addressed this intractability in the low-Dsetting, the high-N, high-Dsetting is still unexplored and of great value, particularly as machine learning problems increasingly become high dimensional. In this paper, we introduce methods to achieve fully scalable Gaussian process regression with derivatives using variational inference. Analogous to the use of inducing values to sparsify the labels of a training set, we introduce the concept of inducing directional derivatives to sparsify the partial derivative information of a training set. This enables us to construct a variational posterior that incorporates derivative information but whose size depends neither on the full dataset size N nor the full dimensionality D. We demonstrate the full scalability of our approach on a variety of tasks, ranging from a high dimensional stellarator fusion regression task to training graph convolutional neural networks on Pubmed using Bayesian optimization. Surprisingly, we find that our approach can improve regression performance even in settings where only label data is available.

artificial intelligence, machine learning, optimization problem, (15 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Genre: Research Report (0.68)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.55)

Add feedback

Select-and-Sample for Spike-and-Slab Sparse Coding

Abdul-Saboor Sheikh, Jörg Lücke

Neural Information Processing SystemsApr-22-2026, 10:33:30 GMT

Probabilistic inference serves as a popular model for neural processing. It is still unclear, however, how approximate probabilistic inference can be accurate and scalable to very high-dimensional continuous latent spaces. Especially as typical posteriors for sensory data can be expected to exhibit complex latent dependencies including multiple modes. Here, we study an approach that can efficiently be scaled while maintaining a richly structured posterior approximation under these conditions. As example model we use spike-and-slab sparse coding for V1 processing, and combine latent subspace selection with Gibbs sampling (selectand-sample).

artificial intelligence, machine learning, sparse, (17 more...)

Neural Information Processing Systems

Country: Europe > Germany (0.28)

Genre: Research Report (0.46)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Enhancing Online Support Group Formation Using Topic Modeling Techniques

Barman, Pronob Kumar, Reynolds, Tera L., Foulds, James

arXiv.org Machine LearningMar-31-2026

Online health communities (OHCs) are vital for fostering peer support and improving health outcomes. Support groups within these platforms can provide more personalized and cohesive peer support, yet traditional support group formation methods face challenges related to scalability, static categorization, and insufficient personalization. To overcome these limitations, we propose two novel machine learning models for automated support group formation: the Group specific Dirichlet Multinomial Regression (gDMR) and the Group specific Structured Topic Model (gSTM). These models integrate user generated textual content, demographic profiles, and interaction data represented through node embeddings derived from user networks to systematically automate personalized, semantically coherent support group formation. We evaluate the models on a large scale dataset from MedHelp, comprising over 2 million user posts. Both models substantially outperform baseline methods including LDA, DMR, and STM in predictive accuracy (held out log likelihood), semantic coherence (UMass metric), and internal group consistency. The gDMR model yields group covariates that facilitate practical implementation by leveraging relational patterns from network structures and demographic data. In contrast, gSTM emphasizes sparsity constraints to generate more distinct and thematically specific groups. Qualitative analysis further validates the alignment between model generated groups and manually coded themes, showing the practical relevance of the models in informing groups that address diverse health concerns such as chronic illness management, diagnostic uncertainty, and mental health. By reducing reliance on manual curation, these frameworks provide scalable solutions that enhance peer interactions within OHCs, with implications for patient engagement, community resilience, and health outcomes.

machine learning, manuscriptsubmittedtoacm, natural language, (17 more...)

arXiv.org Machine Learning

2603.24765

Country:

Europe > Lithuania (0.05)
Oceania > Kiribati (0.04)
Oceania > Australia (0.04)
(47 more...)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Consumer Health (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.94)
Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Add feedback

Rethinking Exploration in Reinforcement Learning with Effective Metric-Based Exploration Bonus

Neural Information Processing SystemsMar-21-2026, 00:04:20 GMT

Enhancing exploration in reinforcement learning (RL) through the incorporation of intrinsic rewards, specifically by leveraging *state discrepancy* measures within various metric spaces as exploration bonuses, has emerged as a prevalent strategy to encourage agents to visit novel states. The critical factor lies in how to quantify the difference between adjacent states as *novelty* for promoting effective exploration.Nonetheless, existing methods that evaluate state discrepancy in the latent space under $L_1$ or $L_2$ norm often depend on count-based episodic terms as scaling factors for exploration bonuses, significantly limiting their scalability. Additionally, methods that utilize the bisimulation metric for evaluating state discrepancies face a theory-practice gap due to improper approximations in metric learning, particularly struggling with *hard exploration* tasks. To overcome these challenges, we introduce the **E**ffective **M**etric-based **E**xploration-bonus (EME). EME critically examines and addresses the inherent limitations and approximation inaccuracies of current metric-based state discrepancy methods for exploration, proposing a robust metric for state discrepancy evaluation backed by comprehensive theoretical analysis. Furthermore, we propose the diversity-enhanced scaling factor integrated into the exploration bonus to be dynamically adjusted by the variance of prediction from an ensemble of reward models, thereby enhancing exploration effectiveness in particularly challenging scenarios. Extensive experiments are conducted on hard exploration tasks within Atari games, Minigrid, Robosuite, and Habitat, which illustrate our method's scalability to various scenarios.

artificial intelligence, machine learning, proceedings, (5 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games > Computer Games (0.59)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Scaling White-Box Transformers for Vision

Neural Information Processing SystemsMar-19-2026, 21:42:01 GMT

CRATE, a white-box transformer architecture designed to learn compressed and sparse representations, offers an intriguing alternative to standard vision transformers (ViTs) due to its inherent mathematical interpretability. Despite extensive investigations into the scaling behaviors of language and vision transformers, the scalability of CRATE remains an open question which this paper aims to address. Specifically, we propose CRATE-$\alpha$, featuring strategic yet minimal modifications to the sparse coding block in the CRATE architecture design, and a light training recipe designed to improve the scalability of CRATE.Through extensive experiments, we demonstrate that CRATE-$\alpha$ can effectively scale with larger model sizes and datasets. For example, our CRATE-$\alpha$-B substantially outperforms the prior best CRATE-B model accuracy on ImageNet classification by 3.7%, achieving an accuracy of 83.2%.

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Vision (0.63)
Information Technology > Artificial Intelligence > Machine Learning (0.57)

Add feedback

Sparsified SGD with Memory

Neural Information Processing SystemsMar-17-2026, 00:03:20 GMT

Huge scale machine learning problems are nowadays tackled by distributed optimization algorithms, i.e. algorithms that leverage the compute power of many devices for training. The communication overhead is a key bottleneck that hinders perfect scalability. Various recent works proposed to use quantization or sparsification techniques to reduce the amount of data that needs to be communicated, for instance by only sending the most significant entries of the stochastic gradient (top-k sparsification). Whilst such schemes showed very promising performance in practice, they have eluded theoretical analysis so far. In this work we analyze Stochastic Gradient Descent (SGD) with k-sparsification or compression (for instance top-k or random-k) and show that this scheme converges at the same rate as vanilla SGD when equipped with error compensation (keeping track of accumulated errors in memory). That is, communication can be reduced by a factor of the dimension of the problem (sometimes even more) whilst still converging at the same rate. We present numerical experiments to illustrate the theoretical findings and the good scalability for distributed applications.

artificial intelligence, machine learning, proceedings, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.84)

Add feedback

Scalable Robust Matrix Factorization with Nonconvex Loss

Neural Information Processing SystemsMar-16-2026, 19:32:21 GMT

Robust matrix factorization (RMF), which uses the $\ell_1$-loss, often outperforms standard matrix factorization using the $\ell_2$-loss, particularly when outliers are present. The state-of-the-art RMF solver is the RMF-MM algorithm, which, however, cannot utilize data sparsity. Moreover, sometimes even the (convex) $\ell_1$-loss is not robust enough. In this paper, we propose the use of nonconvex loss to enhance robustness. To address the resultant difficult optimization problem, we use majorization-minimization (MM) optimization and propose a new MM surrogate. To improve scalability, we exploit data sparsity and optimize the surrogate via its dual with the accelerated proximal gradient algorithm. The resultant algorithm has low time and space complexities and is guaranteed to converge to a critical point. Extensive experiments demonstrate its superiority over the state-of-the-art in terms of both accuracy and scalability.

artificial intelligence, machine learning, proceedings, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.69)

Add feedback

59e02a1440e6667e01628ed4c325255c-Paper-Conference.pdf

Neural Information Processing SystemsFeb-19-2026, 03:53:51 GMT

objective, optimization, probability, (16 more...)

Neural Information Processing Systems

Country:

North America > Cuba > Holguín Province > Holguín (0.04)
Asia > Singapore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.68)

Add feedback

Scalable Structure Learning of Continuous-Time Bayesian Networks from Incomplete Data

Dominik Linzner, Michael Schmidt, Heinz Koeppl

Neural Information Processing SystemsFeb-14-2026, 11:44:41 GMT

Instead ofsampling andscoring allpossible structures individually,we assume the generator of the CTBN to be composed as a mixture of generators stemming from different structures.

artificial intelligence, likelihood, machine learning, (17 more...)

Neural Information Processing Systems

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback