AITopics | Data Science

Learning-Based Low-Rank Approximations

Neural Information Processing Systems | May-23-2025, 09:17:58 GMT

We introduce a "learning-based" algorithm for the low-rank decomposition problem: given an n × d matrix A, and a parameter k, compute a rank-k matrix A

artificial intelligence, data mining, machine learning, (18 more...)

Country: North America > United States > Wisconsin > Dane County > Madison (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Data Science > Data Mining (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

7 Appendix A Limitations

Neural Information Processing Systems | May-23-2025, 06:16:09 GMT

Table 6 provides summary statistics of domain coverage. Overall, the benchmark covers 8,637 biology images and 8,678 pathology images across 12 subdomains. Similarly, Table 7 shows summary statistics of microscopy modalities covered by Micro-Bench perception, including 10,864 images for light microscopy, 5,618 for fluorescence microscopy, and 833 images for electron microscopy across 8 microscopy imaging submodalities and 25 unique microscopy staining techniques (see Table 8). Micro-Bench Perception (Coarse-grained): Hierarchical metadata for each of the 17,235 perception images and task-specific templates (shown in Table 23) are used to create 5 coarse-grained questions and captions regarding microscopy modality, submodality, domain, subdomain, and staining technique. The use of hierarchical metadata enables the generation of options within each hierarchical level.

artificial intelligence, machine learning, natural language, (18 more...)

Country: North America > United States (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(4 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Vision (0.68)
(3 more...)

Topological Attention for Time Series Forecasting

Neural Information Processing Systems | May-23-2025, 06:02:28 GMT

The problem of (point) forecasting univariate time series is considered. Most approaches, ranging from traditional statistical methods to recent learning-based techniques with neural networks, directly operate on raw time series observations. As an extension, we study whether local topological properties, as captured via persistent homology, can serve as a reliable signal that provides complementary information for learning to forecast. To this end, we propose topological attention, which allows attending to local topological features within a time horizon of historical data. Our approach easily integrates into existing end-to-end trainable forecasting models, such as N-BEATS, and, in combination with the latter, exhibits state-of-the-art performance on the large-scale M4 benchmark dataset of 100,000 diverse time series from different domains. Ablation experiments, as well as a comparison to a broad range of forecasting methods in a setting where only a single time series is available for training, corroborate the beneficial nature of including local topological information through an attention mechanism.

artificial intelligence, data mining, machine learning, (17 more...)

Country: Europe > Austria (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Constrained Sampling with Primal-Dual Langevin Monte Carlo

Neural Information Processing Systems | May-23-2025, 04:16:11 GMT

This work considers the problem of sampling from a probability distribution known up to a normalization constant while satisfying a set of statistical constraints specified by the expected values of general nonlinear functions. This problem finds applications in, e.g., Bayesian inference, where it can constrain moments to evaluate counterfactual scenarios or enforce desiderata such as prediction fairness. Methods developed to handle support constraints, such as those based on mirror maps, barriers, and penalties, are not suited for this task. This work therefore relies on gradient descent-ascent dynamics in Wasserstein space to put forward a discrete-time primal-dual Langevin Monte Carlo algorithm (PD-LMC) that simultaneously constrains the target distribution and samples from it. We analyze the convergence of PD-LMC under standard assumptions on the target distribution and constraints, namely (strong) convexity and log-Sobolev inequalities. To do so, we bring classical optimization arguments for saddle-point algorithms to the geometry of Wasserstein space. We illustrate the relevance and effectiveness of PD-LMC in several applications.

artificial intelligence, bayesian inference, machine learning, (16 more...)

Country:

Europe (0.67)
North America > United States (0.27)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Banking & Finance (0.67)
Energy > Oil & Gas > Upstream (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Heterogeneous Multi-player Multi-armed Bandits: Closing the Gap and Generalization

Neural Information Processing Systems | May-22-2025, 21:06:53 GMT

Despite significant interest and much progress in decentralized multi-player multi-armed bandits (MP-MAB) problems in recent years, the regret gap to the natural centralized lower bound in the heterogeneous MP-MAB setting remains open. In this paper, we propose BEACON (Batched Exploration with Adaptive COmmunicatioN), which closes this gap. BEACON accomplishes this goal with novel contributions in implicit communication and efficient exploration. For the former, we propose a novel adaptive differential communication (ADC) design that significantly improves the implicit communication efficiency. For the latter, a carefully crafted batched exploration scheme is developed to enable incorporation of the combinatorial upper confidence bound (CUCB) principle. We then generalize the existing linear-reward MP-MAB problems, where the system reward is always the sum of individually collected rewards, to a new MP-MAB problem where the system reward is a general (nonlinear) function of individual rewards. We extend BEACON to solve this problem and prove a logarithmic regret. BEACON bridges the algorithm design and regret analysis of combinatorial MAB (CMAB) and MP-MAB, two largely disjoint areas in MAB, and the results in this paper suggest that this previously ignored connection is worth further investigation.

artificial intelligence, data mining, machine learning, (18 more...)

Country:

North America > United States (0.28)
Europe > Italy > Sicily (0.14)

Genre: Research Report (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.86)

Improved Coresets and Sublinear Algorithms for Power Means in Euclidean Spaces
Vincent Cohen-Addad, David Saulpic, Chris Schwiegelshohn

Neural Information Processing Systems | May-22-2025, 12:08:04 GMT

Special cases of the problem include the well-known Fermat-Weber problem (or geometric median problem), where z = 1; the mean or centroid, where z = 2; and the Minimum Enclosing Ball problem, where z = ∞. We consider these problems in the big data regime. Here, we are interested in sampling as few points as possible such that we can accurately estimate m. More specifically, we consider sublinear algorithms as well as coresets for these problems. Sublinear algorithms have random query access to the set A, and the goal is to minimize the number of queries.
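
As a rough illustration of the objective behind these special cases, the sketch below evaluates the power-mean cost of a candidate center m: the sum of z-th powers of Euclidean distances, with z = ∞ read as the maximum distance. The function name `power_mean_cost` and the toy point set are assumptions made for this example; the sketch does not implement the coreset or sublinear algorithms from the paper.

```python
import math

def power_mean_cost(points, m, z):
    """Cost of center m: sum of z-th powers of Euclidean distances to the points.

    For z = math.inf the cost is the largest distance, which is the
    Minimum Enclosing Ball objective.
    """
    dists = [math.dist(p, m) for p in points]
    if math.isinf(z):
        return max(dists)
    return sum(d ** z for d in dists)

def centroid(points):
    """Coordinate-wise mean, which minimizes the z = 2 cost."""
    n = len(points)
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / n for i in range(dim))

# Toy data, chosen only for illustration.
points = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
c = centroid(points)

cost_z1_at_origin = power_mean_cost(points, (0.0, 0.0), 1)           # Fermat-Weber objective
cost_z2_at_centroid = power_mean_cost(points, c, 2)                  # squared-distance objective
cost_inf_at_origin = power_mean_cost(points, (0.0, 0.0), math.inf)   # enclosing-ball radius
```

Evaluating the cost exactly touches every point, which is the expense that sampling-based estimates aim to avoid.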

artificial intelligence, data mining, machine learning, (16 more...)

Country:

Europe (1.00)
North America > United States > New York (0.28)
North America > United States > California (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.88)

C2FAR: Coarse-to-Fine Autoregressive Networks for Precise Probabilistic Forecasting

Neural Information Processing Systems | May-22-2025, 10:08:38 GMT

C2FAR generates a hierarchical, coarse-to-fine discretization of a variable autoregressively; progressively finer intervals of support are generated from a sequence of binned distributions, where each distribution is conditioned on previously-generated coarser intervals. Unlike prior (flat) binned distributions, C2FAR can represent values with exponentially higher precision, for only a linear increase in complexity. We use C2FAR for probabilistic forecasting via a recurrent neural network, thus modeling time series autoregressively in both space and time. C2FAR is the first method to simultaneously handle discrete and continuous series of arbitrary scale and distribution shape. This flexibility enables a variety of time series use cases, including anomaly detection, interpolation, and compression. C2FAR achieves improvements over the state-of-the-art on several benchmark forecasting datasets.
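
The coarse-to-fine idea can be sketched without any neural network: a value is encoded as one bin index per level, and each level subdivides the interval chosen at the previous level. The function names, the fixed equal-width bins, and the example range are assumptions for illustration only; in C2FAR each level's distribution is learned and conditioned on the coarser choices, which this sketch omits.

```python
def coarse_to_fine_encode(x, lo, hi, bins, levels):
    """Return one bin index per level for x in [lo, hi].

    Each level splits the interval selected by the previous level
    into `bins` equal-width sub-intervals.
    """
    indices = []
    for _ in range(levels):
        width = (hi - lo) / bins
        idx = min(int((x - lo) / width), bins - 1)  # clamp x == hi into the last bin
        indices.append(idx)
        lo = lo + idx * width
        hi = lo + width
    return indices

def coarse_to_fine_decode(indices, lo, hi, bins):
    """Return the (lo, hi) interval identified by a sequence of bin indices."""
    for idx in indices:
        width = (hi - lo) / bins
        lo = lo + idx * width
        hi = lo + width
    return lo, hi

# Three levels of 4 bins: 12 bin probabilities to model, yet 4**3 = 64 distinct intervals.
codes = coarse_to_fine_encode(0.7, 0.0, 1.0, bins=4, levels=3)
interval = coarse_to_fine_decode(codes, 0.0, 1.0, bins=4)
```

With L levels of B bins, the number of modelled outcomes grows as L × B while the resolution grows as B to the power L, which is the linear-versus-exponential trade-off the abstract describes.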

data mining, machine learning, natural language, (17 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Tight Lower Bound and Efficient Reduction for Swap Regret

Neural Information Processing Systems | May-22-2025, 08:37:12 GMT

Swap regret, a generic performance measure of online decision-making algorithms, plays an important role in the theory of repeated games, along with a close connection to correlated equilibria in strategic games. This paper shows an Ω(√(TN log N)) lower bound for swap regret, where T and N denote the numbers of time steps and available actions, respectively. Our lower bound is tight up to a constant, and resolves an open problem mentioned, e.g., in the book by Nisan et al. [28]. Besides, we present a computationally efficient reduction method that converts no-external-regret algorithms to no-swap-regret algorithms. This method can be applied not only to the full-information setting but also to the bandit setting and provides a better regret bound than previous results.
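
To make the two regret notions concrete, the sketch below computes external and swap regret for a fixed sequence of deterministic actions against known loss vectors. External regret compares with the best single action in hindsight; swap regret compares with the best mapping that replaces every play of action i by some action j. The function names and the toy loss sequence are assumptions for this example, and the code is not the reduction or the lower-bound construction from the paper.

```python
def external_regret(actions, losses):
    """Incurred loss minus the loss of the best fixed action in hindsight."""
    n = len(losses[0])
    incurred = sum(loss[a] for a, loss in zip(actions, losses))
    best_fixed = min(sum(loss[j] for loss in losses) for j in range(n))
    return incurred - best_fixed

def swap_regret(actions, losses):
    """Incurred loss minus the loss under the best swap function i -> j.

    The best swap function decomposes per action: for each played action i,
    pick the replacement j that minimizes loss over the rounds where i was played.
    """
    n = len(losses[0])
    total = 0.0
    for i in range(n):
        rounds = [loss for a, loss in zip(actions, losses) if a == i]
        if not rounds:
            continue
        incurred = sum(loss[i] for loss in rounds)
        best_swap = min(sum(loss[j] for loss in rounds) for j in range(n))
        total += incurred - best_swap
    return total

# Toy sequence with N = 2 actions and T = 4 rounds; the player always picks the costly action.
actions = [0, 1, 0, 1]
losses = [[1, 0], [0, 1], [1, 0], [0, 1]]
ext = external_regret(actions, losses)
swp = swap_regret(actions, losses)
```

Because the identity-like constant mappings are a subset of all swap functions, swap regret is never smaller than external regret on the same sequence.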

artificial intelligence, data mining, machine learning, (21 more...)

Country:

Europe > United Kingdom > England (0.14)
North America > Canada (0.14)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.47)

SkinCon: A skin disease dataset densely annotated by domain experts for fine-grained model debugging and analysis
Roberto Novoa

Neural Information Processing Systems | May-22-2025, 00:54:16 GMT

However, there are only a few datasets that include concept-level meta-labels and most of these meta-labels are relevant for natural images that do not require domain expertise. Previous densely annotated datasets in medicine focused on meta-labels that are relevant to a single disease such as osteoarthritis or melanoma. In dermatology, skin disease is described using an established clinical lexicon that allows clinicians to describe physical exam findings to one another. To provide a medical dataset densely annotated by domain experts with annotations useful across multiple disease processes, we developed SkinCon: a skin disease dataset densely annotated by dermatologists. SkinCon includes 3230 images from the Fitzpatrick 17k skin disease dataset densely annotated with 48 clinical concepts, 22 of which have at least 50 images representing the concept. The concepts used were chosen by two dermatologists considering the clinical descriptor terms used to describe skin lesions.

artificial intelligence, dataset, machine learning, (12 more...)

Country: North America > United States (0.28)

Genre: Research Report (0.47)

Industry:

Health & Medicine > Therapeutic Area > Dermatology (1.00)
Health & Medicine > Therapeutic Area > Oncology > Skin Cancer (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Vision (0.68)
Information Technology > Data Science (0.68)

Efficient Streaming Algorithms for Graphlet Sampling
Marco Bressan Cispa Helmholtz Center for Information Security Department of Computer Science Saarland University

Neural Information Processing Systems | May-21-2025, 23:23:55 GMT

Given a graph G and a positive integer k, the Graphlet Sampling problem asks to sample a connected induced k-vertex subgraph of G uniformly at random. Graphlet sampling enhances machine learning applications by transforming graph structures into feature vectors for tasks such as graph classification and subgraph identification, boosting neural network performance, and supporting clustered federated learning by capturing local structures and relationships.
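
The problem statement can be illustrated with a brute-force baseline: enumerate every k-vertex subset, keep those whose induced subgraph is connected, and pick one uniformly at random. The function names and the small path graph are assumptions for this example. This enumeration takes time exponential in k and needs the whole graph in memory, so it is only a reference point and not the streaming algorithm the paper proposes.

```python
import itertools
import random

def is_connected(vertices, adj):
    """Check whether the subgraph induced by `vertices` is connected (DFS)."""
    vertices = set(vertices)
    start = next(iter(vertices))
    seen = {start}
    stack = [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in vertices and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == vertices

def all_graphlets(adj, k):
    """List every k-vertex subset that induces a connected subgraph."""
    return [c for c in itertools.combinations(sorted(adj), k)
            if is_connected(c, adj)]

def sample_graphlet(adj, k, rng):
    """Draw one connected induced k-vertex subgraph uniformly at random."""
    return rng.choice(all_graphlets(adj, k))

# Path graph 0 - 1 - 2 - 3 as an adjacency list.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
graphlets = all_graphlets(path, 3)
sample = sample_graphlet(path, 3, random.Random(0))
```

On the path graph, only the two contiguous triples are connected, so each is returned with probability one half.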

artificial intelligence, data mining, machine learning, (18 more...)

Country: