AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Neural Information Processing SystemsOct-8-2024, 16:29:29 GMT

Uncertainty Sampling is Preconditioned Stochastic Gradient Descent on Zero-One Loss

preconditioned stochastic gradient descent, uncertainty sampling, zero-one loss

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.69)

Neural Information Processing SystemsOct-7-2024, 10:57:12 GMT

Reviews: Uncertainty Sampling is Preconditioned Stochastic Gradient Descent on Zero-One Loss

This paper provides theoretical analysis and empirical examples for two phenomenon in active learning. The first is it could be possible that the 0-1 loss on subset of the entire dataset generated uncertainty sampling is smaller than learning using the whole dataset. The second is uncertainty sampling could "converge" to different models and predictive results. In the analysis, it is shown that the reason for these is the expected gradient of the "surrogate" loss of the most uncertain point is in the direction of the gradient of the current 0-1 loss. This result is based on the setup that the most uncertain point is sampled from a minipool that is a subset sampled without replacement randomly from the entire set.

preconditioned stochastic gradient descent, uncertain point, uncertainty sampling, (9 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.85)

arXiv.org Artificial IntelligenceJun-6-2024

Provably Neural Active Learning Succeeds via Prioritizing Perplexing Samples

Bu, Dake, Huang, Wei, Suzuki, Taiji, Cheng, Ji, Zhang, Qingfu, Xu, Zhiqiang, Wong, Hau-San

Neural Network-based active learning (NAL) is a cost-effective data selection technique that utilizes neural networks to select and train on a small subset of samples. While existing work successfully develops various effective or theory-justified NAL algorithms, the understanding of the two commonly used query criteria of NAL: uncertainty-based and diversity-based, remains in its infancy. In this work, we try to move one step forward by offering a unified explanation for the success of both query criteria-based NAL from a feature learning view. Specifically, we consider a feature-noise data model comprising easy-to-learn or hard-to-learn features disrupted by noise, and conduct analysis over 2-layer NN-based NALs in the pool-based scenario. We provably show that both uncertainty-based and diversity-based NAL are inherently amenable to one and the same principle, i.e., striving to prioritize samples that contain yet-to-be-learned features. We further prove that this shared principle is the key to their success-achieve small test error within a small labeled set. Contrastingly, the strategy-free passive learning exhibits a large test error due to the inadequate learning of yet-to-be-learned features, necessitating resort to a significantly larger label complexity for a sufficient test error reduction. Experimental results validate our findings.

algorithm, probability, provably neural active learning succeed, (9 more...)

2406.03944

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Europe > Austria > Vienna (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Chong, Zan-Kai, Ohsaki, Hiroyuki, Goi, Bok-Min

Improving Uncertainty Sampling with Bell Curve Weight Function

arXiv.org Artificial IntelligenceMar-2-2024

Typically, a supervised learning model is trained using passive learning by randomly selecting unlabelled instances to annotate. This approach is effective for learning a model, but can be costly in cases where acquiring labelled instances is expensive. For example, it can be time-consuming to manually identify spam mails (labelled instances) from thousands of emails (unlabelled instances) flooding an inbox during initial data collection. Generally, we answer the above scenario with uncertainty sampling, an active learning method that improves the efficiency of supervised learning by using fewer labelled instances than passive learning. Given an unlabelled data pool, uncertainty sampling queries the labels of instances where the predicted probabilities, p, fall into the uncertainty region, i.e., $p \approx 0.5$. The newly acquired labels are then added to the existing labelled data pool to learn a new model. Nonetheless, the performance of uncertainty sampling is susceptible to the area of unpredictable responses (AUR) and the nature of the dataset. It is difficult to determine whether to use passive learning or uncertainty sampling without prior knowledge of a new dataset. To address this issue, we propose bell curve sampling, which employs a bell curve weight function to acquire new labels. With the bell curve centred at p=0.5, bell curve sampling selects instances whose predicted values are in the uncertainty area most of the time without neglecting the rest. Simulation results show that, most of the time bell curve sampling outperforms uncertainty sampling and passive learning in datasets of different natures and with AUR.

bell curve, dataset, learning, (15 more...)

doi: 10.17706/ijapm.2023.13.4.44-52

2403.01352

Country:

Asia > Malaysia (0.05)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Asia > Japan (0.04)

Genre: Research Report (0.84)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Yang, Chenhongyi, Huang, Lichao, Crowley, Elliot J.

Plug and Play Active Learning for Object Detection

arXiv.org Artificial IntelligenceNov-21-2022

Annotating data for supervised learning is expensive and tedious, and we want to do as little of it as possible. To make the most of a given "annotation budget" we can turn to active learning (AL) which aims to identify the most informative samples in a dataset for annotation. Active learning algorithms are typically uncertainty-based or diversity-based. Both have seen success in image classification, but fall short when it comes to object detection. We hypothesise that this is because: (1) it is difficult to quantify uncertainty for object detection as it consists of both localisation and classification, where some classes are harder to localise, and others are harder to classify; (2) it is difficult to measure similarities for diversity-based AL when images contain different numbers of objects. We propose a two-stage active learning algorithm Plug and Play Active Learning (PPAL) that overcomes these difficulties. It consists of (1) Difficulty Calibrated Uncertainty Sampling, in which we used a category-wise difficulty coefficient that takes both classification and localisation into account to re-weight object uncertainties for uncertainty-based sampling; (2) Category Conditioned Matching Similarity to compute the similarities of multi-instance images as ensembles of their instance similarities. PPAL is highly generalisable because it makes no change to model architectures or detector training pipelines. We benchmark PPAL on the MS-COCO and Pascal VOC datasets using different detector architectures and show that our method outperforms the prior state-of-the-art. Code is available at https://github.com/ChenhongyiYang/PPAL

artificial intelligence, machine learning, object-oriented architecture, (15 more...)

2211.11612

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Object-Oriented Architecture (0.49)

Mondal, Ishani, Ganguly, Debasis

ALEX: Active Learning based Enhancement of a Model's Explainability

arXiv.org Artificial IntelligenceSep-2-2020

An active learning (AL) algorithm seeks to construct an effective classifier with a minimal number of labeled examples in a bootstrapping manner. While standard AL heuristics, such as selecting those points for annotation for which a classification model yields least confident predictions, there has been no empirical investigation to see if these heuristics lead to models that are more interpretable to humans. In the era of data-driven learning, this is an important research direction to pursue. This paper describes our work-in-progress towards developing an AL selection function that in addition to model effectiveness also seeks to improve on the interpretability of a model during the bootstrapping steps. Concretely speaking, our proposed selection function trains an `explainer' model in addition to the classifier model, and favours those instances where a different part of the data is used, on an average, to explain the predicted class. Initial experiments exhibited encouraging trends in showing that such a heuristic can lead to developing more effective and more explainable end-to-end data-driven classifiers.

artificial intelligence, machine learning, uncertainty sampling, (15 more...)

2009.00859

Country:

North America > United States > District of Columbia > Washington (0.05)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > India > West Bengal > Kharagpur (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Mussmann, Stephen, Liang, Percy S.

Uncertainty Sampling is Preconditioned Stochastic Gradient Descent on Zero-One Loss

Neural Information Processing SystemsFeb-14-2020, 19:11:51 GMT

preconditioned stochastic gradient descent, uncertainty sampling, zero-one loss

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.69)

#artificialintelligenceOct-24-2019, 06:13:25 GMT

Knowledge Quadrant for Machine Learning

Most Machine Learning systems that are deployed in the world today learn from human feedback. For example, a self-driving car can understand a stop sign because humans have manually labeled 1,000s of examples of stop signs in videos taken from cars. Those labeled examples are what teaches the algorithms deployed in the cars to automatically identify the stop signs. However, most Machine Learning courses focus almost exclusively on the algorithms, not the Human-Computer Interaction part of the systems. This can leave a big knowledge gap for Data Scientists working in real-world Machine Learning, where they will spend more time on data management than on building algorithms.

learning, machine learning, sampling, (16 more...)

#artificialintelligence

Industry: Education (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.55)

Mussmann, Stephen, Liang, Percy

On the Relationship between Data Efficiency and Error for Uncertainty Sampling

arXiv.org Machine LearningJun-15-2018

While active learning offers potential cost savings, the actual data efficiency---the reduction in amount of labeled data needed to obtain the same error rate---observed in practice is mixed. This paper poses a basic question: when is active learning actually helpful? We provide an answer for logistic regression with the popular active learning algorithm, uncertainty sampling. Empirically, on 21 datasets from OpenML, we find a strong inverse correlation between data efficiency and the error rate of the final classifier. Theoretically, we show that for a variant of uncertainty sampling, the asymptotic data efficiency is within a constant factor of the inverse error rate of the limiting classifier.

artificial intelligence, data efficiency and error, machine learning, (11 more...)

arXiv.org Machine Learning

1806.06123

Country:

North America > United States > New York (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)