AITopics | selective

2512.07993

Country: North America > United States > California (0.28)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.48)

arXiv.org Artificial IntelligenceAug-7-2025

SenseCrypt: Sensitivity-guided Selective Homomorphic Encryption for Joint Federated Learning in Cross-Device Scenarios

Li, Borui, Yan, Li, Han, Junhao, Liu, Jianmin, Yu, Lei

Homomorphic Encryption (HE) prevails in securing Federated Learning (FL), but suffers from high overhead and adaptation cost. Selective HE methods, which partially encrypt model parameters by a global mask, are expected to protect privacy with reduced overhead and easy adaptation. However, in cross-device scenarios with heterogeneous data and system capabilities, traditional Selective HE methods deteriorate client straggling, and suffer from degraded HE overhead reduction performance. Accordingly, we propose SenseCrypt, a Sen sitivity-guided se lective Homomor-phic En Crypt ion framework, to adaptively balance security and HE overhead per cross-device FL client. Given the observation that model parameter sensitivity is effective for measuring clients' data distribution similarity, we first design a privacy-preserving method to respectively cluster the clients with similar data distributions. Then, we develop a scoring mechanism to deduce the straggler-free ratio of model parameters that can be encrypted by each client per cluster. Finally, for each client, we formulate and solve a multi-objective model parameter selection optimization problem, which minimizes HE overhead while maximizing model security without causing straggling. Experiments demonstrate that Sense-Crypt ensures security against the state-of-the-art inversion attacks, while achieving normal model accuracy as on IID data, and reducing training time by 58.4% 88.7% as compared to traditional HE methods.

artificial intelligence, machine learning, model parameter, (16 more...)

2508.041

Country:

Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
North America > United States > New York > Rensselaer County > Troy (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing SystemsFeb-11-2025, 18:04:06 GMT

Selective Labeling via Error Bound Minimization

Quanquan Gu, Tong Zhang, Jiawei Han, Chris H. Ding

In many practical machine learning problems, the acquisition of labeled data is often expensive and/or time consuming. This motivates us to study a problem as follows: given a label budget, how to select data points to label such that the learning performance is optimized. We propose a selective labeling method by analyzing the out-of-sample error of Laplacian regularized Least Squares (LapRLS). In particular, we derive a deterministic out-of-sample error bound for LapRLS trained on subsampled data, and propose to select a subset of data points to label by minimizing this upper bound. Since the minimization is a combinational problem, we relax it into continuous domain and solve it by projected gradient descent. Experiments on benchmark datasets show that the proposed method outperforms the state-of-the-art methods.

artificial intelligence, laprls, machine learning, (17 more...)

Country:

North America > United States > Illinois (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > Texas (0.04)
(5 more...)

Genre: Research Report > Promising Solution (0.34)

Industry: Government > Military (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Neural Information Processing SystemsJan-20-2025, 03:37:48 GMT

Selective Labeling via Error Bound Minimization

In many practical machine learning problems, the acquisition of labeled data is often expensive and/or time consuming. This motivates us to study a problem as follows: given a label budget, how to select data points to label such that the learning performance is optimized. We propose a selective labeling method by analyzing the generalization error of Laplacian regularized Least Squares (LapRLS). In particular, we derive a deterministic generalization error bound for LapRLS trained on subsampled data, and propose to select a subset of data points to label by minimizing this upper bound. Since the minimization is a combinational problem, we relax it into continuous domain and solve it by projected gradient descent.

error bound minimization, generalization error, selective, (1 more...)

Industry: Education > Focused Education > Special Education (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing SystemsOct-7-2024, 03:26:53 GMT

Selective Labeling via Error Bound Minimization

Quanquan Gu, Tong Zhang, Jiawei Han, Chris H. Ding

laprls, learning, out-of-sample error, (16 more...)

Country:

North America > United States > Illinois (0.04)
North America > United States > Wisconsin (0.04)
North America > United States > Texas (0.04)
(5 more...)

Genre: Research Report > Promising Solution (0.34)

Industry: Government > Military (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

arXiv.org Artificial IntelligenceMay-14-2024

Falcon 7b for Software Mention Detection in Scholarly Documents

Khan, AmeerAli, Ramadan, Qusai, Yang, Cong, Boukhers, Zeyd

This paper aims to tackle the challenge posed by the increasing integration of software tools in research across various disciplines by investigating the application of Falcon-7b for the detection and classification of software mentions within scholarly texts. Specifically, the study focuses on solving Subtask I of the Software Mention Detection in Scholarly Publications (SOMD), which entails identifying and categorizing software mentions from academic literature. Through comprehensive experimentation, the paper explores different training strategies, including a dual-classifier approach, adaptive sampling, and weighted loss scaling, to enhance detection accuracy while overcoming the complexities of class imbalance and the nuanced syntax of scholarly writing. The findings highlight the benefits of selective labelling and adaptive sampling in improving the model's performance. However, they also indicate that integrating multiple strategies does not necessarily result in cumulative improvements. This research offers insights into the effective application of large language models for specific tasks such as SOMD, underlining the importance of tailored approaches to address the unique challenges presented by academic text analysis.

application, entity recognition, software mention detection, (11 more...)

2405.08514

Country:

Oceania > Australia > Queensland (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
(2 more...)

Genre:

Overview (0.93)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceAug-9-2023

SLPT: Selective Labeling Meets Prompt Tuning on Label-Limited Lesion Segmentation

Bai, Fan, Yan, Ke, Bai, Xiaoyu, Mao, Xinyu, Yin, Xiaoli, Zhou, Jingren, Shi, Yu, Lu, Le, Meng, Max Q. -H.

Medical image analysis using deep learning is often challenged by limited labeled data and high annotation costs. Fine-tuning the entire network in label-limited scenarios can lead to overfitting and suboptimal performance. Recently, prompt tuning has emerged as a more promising technique that introduces a few additional tunable parameters as prompts to a task-agnostic pre-trained model, and updates only these parameters using supervision from limited labeled data while keeping the pre-trained model unchanged. However, previous work has overlooked the importance of selective labeling in downstream tasks, which aims to select the most valuable downstream samples for annotation to achieve the best performance with minimum annotation cost. To address this, we propose a framework that combines selective labeling with prompt tuning (SLPT) to boost performance in limited labels. Specifically, we introduce a feature-aware prompt updater to guide prompt tuning and a TandEm Selective LAbeling (TESLA) strategy. TESLA includes unsupervised diversity selection and supervised selection using prompt-based uncertainty. In addition, we propose a diversified visual prompt tuning strategy to provide multi-prompt-based discrepant predictions for TESLA. We evaluate our method on liver tumor segmentation and achieve state-of-the-art performance, outperforming traditional fine-tuning with only 6% of tunable parameters, also achieving 94% of full-data performance by labeling only 5% of the data.

artificial intelligence, machine learning, pre-trained model, (16 more...)

2308.04911

Country:

Asia > China > Hong Kong (0.05)
North America > Canada > Quebec > Capitale-Nationale Region > Québec (0.04)
North America > Canada > Quebec > Capitale-Nationale Region > Quebec City (0.04)
(5 more...)

Genre: Research Report > Promising Solution (0.34)

Industry:

Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Berrevoets, Jeroen, Imrie, Fergus, Kyono, Trent, Jordon, James, van der Schaar, Mihaela

To Impute or not to Impute? -- Missing Data in Treatment Effect Estimation

arXiv.org Machine LearningFeb-4-2022

Missing data is a systemic problem in practical scenarios that causes noise and bias when estimating treatment effects. This makes treatment effect estimation from data with missingness a particularly tricky endeavour. A key reason for this is that standard assumptions on missingness are rendered insufficient due to the presence of an additional variable, treatment, besides the individual and the outcome. Having a treatment variable introduces additional complexity with respect to why some variables are missing that is not fully explored by previous work. In our work we identify a new missingness mechanism, which we term mixed confounded missingness (MCM), where some missingness determines treatment selection and other missingness is determined by treatment selection. Given MCM, we show that naively imputing all data leads to poor performing treatment effects models, as the act of imputation effectively removes information necessary to provide unbiased estimates. However, no imputation at all also leads to biased estimates, as missingness determined by treatment divides the population in distinct subpopulations, where estimates across these populations will be biased. Our solution is selective imputation, where we use insights from MCM to inform precisely which variables should be imputed and which should not. We empirically demonstrate how various learners benefit from selective imputation compared to other solutions for missing data.

imputation, missingness, treatment effect, (16 more...)

arXiv.org Machine Learning

2202.02096

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > New York > New York County > New York City (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
(2 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing SystemsFeb-14-2020, 21:41:55 GMT

Selective Labeling via Error Bound Minimization

Gu, Quanquan, Zhang, Tong, Han, Jiawei, Ding, Chris H.

In many practical machine learning problems, the acquisition of labeled data is often expensive and/or time consuming. This motivates us to study a problem as follows: given a label budget, how to select data points to label such that the learning performance is optimized. We propose a selective labeling method by analyzing the generalization error of Laplacian regularized Least Squares (LapRLS). In particular, we derive a deterministic generalization error bound for LapRLS trained on subsampled data, and propose to select a subset of data points to label by minimizing this upper bound. Since the minimization is a combinational problem, we relax it into continuous domain and solve it by projected gradient descent.

error bound minimization, generalization error, selective, (1 more...)

Industry: Education > Focused Education > Special Education (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

WIREDJun-20-2018, 18:00:20 GMT

Trump Stokes Outrage in Silicon Valley--But It's Selective

Silicon Valley is in the middle of an awakening, the dawning but selective realization that their products can be used to achieve terrible ends. In the past few months, this growing unease has bubbled up into outright rebellion from within the rank and file of some of the largest companies in the Valley, beginning in April when Google employees balked at the company's involvement with a Pentagon artificial intelligence program called Project Maven. On Monday, Amazon shareholders sent an open letter asking CEO Jeff Bezos to halt a program developing facial recognition software for governments pending a review by the board of directors. Also this week, as general horror built up over the Trump administration's new "zero tolerance" immigration policy, which has led to the separation of more than 2,000 children from their parents, Microsoft employees objected to their company's contract with US Immigration and Customs Enforcement to use Microsoft's Azure cloud services. "We are part of a growing movement, comprised of many across the industry who recognize the grave responsibility that those creating powerful technology have to ensure what they build is used for good, and not for harm," reads an open letter posted to the company's internal message board Tuesday.

artificial intelligence, silicon valley, social media, (15 more...)

WIRED

Country:

North America > United States > California (0.64)
Europe > Russia (0.05)
Europe > Germany (0.05)
(2 more...)

Industry:

Information Technology > Services (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Immigration & Customs (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (0.90)