AITopics | statistical subset selection problem

Fast Parallel Algorithms for Statistical Subset Selection Problems

Neural Information Processing Systems, Dec-25-2025, 21:02:52 GMT

In this paper, we propose a new framework for designing fast parallel algorithms for fundamental statistical subset selection tasks that include feature selection and experimental design. Such tasks are known to be weakly submodular and are amenable to optimization via the standard greedy algorithm. Despite its desirable approximation guarantees, however, the greedy algorithm is inherently sequential and in the worst case, its parallel runtime is linear in the size of the data. Recently, there has been a surge of interest in a parallel optimization technique called adaptive sampling which produces solutions with desirable approximation guarantees for submodular maximization in exponentially faster parallel runtime. Unfortunately, we show that for general weakly submodular functions such accelerations are impossible. The major contribution in this paper is a novel relaxation of submodularity which we call differential submodularity. We first prove that differential submodularity characterizes objectives like feature selection and experimental design. We then design an adaptive sampling algorithm for differentially submodular functions whose parallel runtime is logarithmic in the size of the data and achieves strong approximation guarantees. Through experiments, we show the algorithm's performance is competitive with state-of-the-art methods and obtains dramatic speedups for feature selection and experimental design problems.
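To make the sequential bottleneck described above concrete, here is a minimal sketch of the standard greedy algorithm applied to feature selection. It is not the authors' code. The objective (explained sum of squares of a least-squares fit without an intercept) and the function names `explained_sum_of_squares` and `greedy_select` are assumptions chosen for illustration. Each of the k rounds depends on the set chosen in the previous round, which is why the rounds cannot run in parallel even though the candidate evaluations inside a round can.

```python
import numpy as np


def explained_sum_of_squares(X, y, subset):
    """Reduction in squared error from regressing y on the columns in `subset`.

    Illustrative objective (an assumption, not taken from the paper):
    f(S) = ||y||^2 - min_w ||y - X_S w||^2, with f(empty set) = 0.
    """
    if len(subset) == 0:
        return 0.0
    Xs = X[:, list(subset)]
    coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    residual = y - Xs @ coef
    return float(y @ y) - float(residual @ residual)


def greedy_select(X, y, k):
    """Pick k columns of X, one per round, by largest marginal gain."""
    selected = []
    for _ in range(k):
        current = explained_sum_of_squares(X, y, selected)
        remaining = [j for j in range(X.shape[1]) if j not in selected]
        # The gains within a round are independent and could be computed in
        # parallel; the k rounds themselves are inherently sequential.
        gains = {
            j: explained_sum_of_squares(X, y, selected + [j]) - current
            for j in remaining
        }
        selected.append(max(gains, key=gains.get))
    return selected


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 6))
    # Synthetic response driven by columns 1 and 4 plus a little noise.
    y = 2.0 * X[:, 1] - 3.0 * X[:, 4] + 0.01 * rng.standard_normal(200)
    print(greedy_select(X, y, 2))
```

On the synthetic data in the usage example, the two informative columns should be selected. Adaptive sampling, as described in the abstract, aims to reduce the number of sequential rounds from linear to logarithmic in the data size; this sketch does not attempt that.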

fast parallel algorithm, name change, statistical subset selection problem, (8 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.86)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.83)

Reviews: Fast Parallel Algorithms for Statistical Subset Selection Problems

Neural Information Processing Systems, Jan-26-2025, 11:13:18 GMT

The authors propose a relaxation of submodularity, called differential submodularity, in which the marginal gains are bounded from below and above by those of two submodular functions. They use this concept to provide approximation guarantees for a parallel algorithm, namely adaptive sampling, for maximizing weakly submodular functions, and show its applicability to parallel feature selection and experimental design. Overall, the paper is well written and the problem is well motivated. The main motivation for parallelized algorithms is their applicability to large datasets. Although we see some speedup for relatively small datasets in the experiments, my main concern is that, due to the large number of rounds in the worst case and the large sample complexity, the algorithm may not scale to large datasets, especially in the actual distributed setting, (e.g.

algorithm, dataset, statistical subset selection problem, (11 more...)

Technology:

Information Technology > Architecture > Distributed Systems (0.62)
Information Technology > Artificial Intelligence (0.38)

Fast Parallel Algorithms for Statistical Subset Selection Problems

Neural Information Processing Systems, Oct-10-2024, 17:50:53 GMT

algorithm, fast parallel algorithm, statistical subset selection problem, (7 more...)

Genre: Research Report (0.52)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.87)
Information Technology > Architecture > Distributed Systems (0.65)

Fast Parallel Algorithms for Statistical Subset Selection Problems

Qian, Sharon, Singer, Yaron

Neural Information Processing Systems, Mar-18-2020, 22:32:23 GMT

algorithm, fast parallel algorithm, statistical subset selection problem, (7 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.87)
Information Technology > Architecture > Distributed Systems (0.65)
Information Technology > Artificial Intelligence > Machine Learning (0.56)
