AITopics

2303.00633

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Singapore (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.89)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

arXiv.org Artificial IntelligenceMar-5-2023

Robustness, Evaluation and Adaptation of Machine Learning Models in the Wild

Piratla, Vihari

Our goal is to improve reliability of Machine Learning (ML) systems deployed in the wild. ML models perform exceedingly well when test examples are similar to train examples. However, real-world applications are required to perform on any distribution of test examples. Current ML systems can fail silently on test examples with distribution shifts. In order to improve reliability of ML models due to covariate or domain shift, we propose algorithms that enable models to: (a) generalize to a larger family of test distributions, (b) evaluate accuracy under distribution shifts, (c) adapt to a target distribution. We study causes of impaired robustness to domain shifts and present algorithms for training domain robust models. A key source of model brittleness is due to domain overfitting, which our new training algorithms suppress and instead encourage domain-general hypotheses. While we improve robustness over standard training methods for certain problem settings, performance of ML systems can still vary drastically with domain shifts. It is crucial for developers and stakeholders to understand model vulnerabilities and operational ranges of input, which could be assessed on the fly during the deployment, albeit at a great cost. Instead, we advocate for proactively estimating accuracy surfaces over any combination of prespecified and interpretable domain shifts for performance forecasting. We present a label-efficient estimation to address estimation over a combinatorial space of domain shifts. Further, when a model's performance on a target domain is found to be poor, traditional approaches adapt the model using the target domain's resources. Standard adaptation methods assume access to sufficient labeled resources, which may be impractical for deployed models. We initiate a study of lightweight adaptation techniques with only unlabeled data resources with a focus on language applications.

artificial intelligence, large language model, natural language, (20 more...)

2303.02781

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Africa (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(8 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Experimental Study (0.67)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Education (0.92)
Health & Medicine > Diagnostic Medicine > Imaging (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(3 more...)

arXiv.org Artificial IntelligenceMar-5-2023

CoRTX: Contrastive Framework for Real-time Explanation

Chuang, Yu-Neng, Wang, Guanchu, Yang, Fan, Zhou, Quan, Tripathi, Pushkar, Cai, Xuanting, Hu, Xia

Recent advancements in explainable machine learning provide effective and faithful solutions for interpreting model behaviors. However, many explanation methods encounter efficiency issues, which largely limit their deployments in practical scenarios. Real-time explainer (RTX) frameworks have thus been proposed to accelerate the model explanation process by learning a one-feed-forward explainer. Existing RTX frameworks typically build the explainer under the supervised learning paradigm, which requires large amounts of explanation labels as the ground truth. Considering that accurate explanation labels are usually hard to obtain due to constrained computational resources and limited human efforts, effective explainer training is still challenging in practice. In this work, we propose a COntrastive Real-Time eXplanation (CoRTX) framework to learn the explanation-oriented representation and relieve the intensive dependence of explainer training on explanation labels. Specifically, we design a synthetic strategy to select positive and negative instances for the learning of explanation. Theoretical analysis show that our selection strategy can benefit the contrastive learning process on explanation tasks. Experimental results on three real-world datasets further demonstrate the efficiency and efficacy of our proposed CoRTX framework.

artificial intelligence, explanation, machine learning, (19 more...)

2303.02794

Country: Europe (0.14)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.66)

Sharoni, Michal, Sabato, Sivan

On the Capacity Limits of Privileged ERM

arXiv.org Machine LearningMar-5-2023

We study the supervised learning paradigm called Learning Using Privileged Information, first suggested by Vapnik and Vashist (2009). In this paradigm, in addition to the examples and labels, additional (privileged) information is provided only for training examples. The goal is to use this information to improve the classification accuracy of the resulting classifier, where this classifier can only use the non-privileged information of new example instances to predict their label. We study the theory of privileged learning with the zero-one loss under the natural Privileged ERM algorithm proposed in Pechyony and Vapnik (2010a). We provide a counter example to a claim made in that work regarding the VC dimension of the loss class induced by this problem; We conclude that the claim is incorrect. We then provide a correct VC dimension analysis which gives both lower and upper bounds on the capacity of the Privileged ERM loss class. We further show, via a generalization analysis, that worst-case guarantees for Privileged ERM cannot improve over standard non-privileged ERM, unless the capacity of the privileged information is similar or smaller to that of the non-privileged information. This result points to an important limitation of the Privileged ERM approach. In our closing discussion, we suggest another way in which Privileged ERM might still be helpful, even when the capacity of the privileged information is large.

artificial intelligence, inductive learning, machine learning, (16 more...)

arXiv.org Machine Learning

2303.02658

Country:

Asia > Middle East > Israel > Southern District > Beer-Sheva (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

#artificialintelligenceMar-4-2023, 05:40:10 GMT

Supervised learning…. Introduction and Explanation

Supervised learning is a type of machine learning where an algorithm is trained on a labeled dataset. In supervised learning, the algorithm is provided with input-output pairs, and it learns to predict the output for new inputs. The goal of supervised learning is to train the algorithm to generalize to new inputs and outputs beyond the training data. Supervised learning is often used in applications where there is a well-defined output variable, such as predicting stock prices, diagnosing diseases, or recognizing objects in images. It is an effective technique for solving a wide range of problems, including regression and classification tasks.

algorithm, learning, supervised learning, (12 more...)

#artificialintelligence

Industry: Banking & Finance (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

arXiv.org Artificial IntelligenceMar-4-2023

Federated Semi-Supervised Learning with Annotation Heterogeneity

Shang, Xinyi, Huang, Gang, Lu, Yang, Lou, Jian, Han, Bo, Cheung, Yiu-ming, Wang, Hanzi

Federated Semi-Supervised Learning (FSSL) aims to learn a global model from different clients in an environment with both labeled and unlabeled data. Most of the existing FSSL work generally assumes that both types of data are available on each client. In this paper, we study a more general problem setup of FSSL with annotation heterogeneity, where each client can hold an arbitrary percentage (0%-100%) of labeled data. To this end, we propose a novel FSSL framework called Heterogeneously Annotated Semi-Supervised LEarning (HASSLE). Specifically, it is a dual-model framework with two models trained separately on labeled and unlabeled data such that it can be simply applied to a client with an arbitrary labeling percentage. Furthermore, a mutual learning strategy called Supervised-Unsupervised Mutual Alignment (SUMA) is proposed for the dual models within HASSLE with global residual alignment and model proximity alignment. Subsequently, the dual models can implicitly learn from both types of data across different clients, although each dual model is only trained locally on a single type of data. Experiments verify that the dual models in HASSLE learned by SUMA can mutually learn from each other, thereby effectively utilizing the information of both types of data across different clients.

artificial intelligence, inductive learning, machine learning, (17 more...)

2303.02445

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Virginia (0.04)
Europe > Italy (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.92)

Karyono, Kanisius, Abdullah, Badr M., Cotgrave, Alison J., Bras, Ana, Cullen, Jeff

Developing the Reliable Shallow Supervised Learning for Thermal Comfort using ASHRAE RP-884 and ASHRAE Global Thermal Comfort Database II

This work has been submitted to the IEEE for possible publication in the IEEE Transaction on Pattern Analysis and Machine Intelligence (T-PAMI) on 7 January 2022. Abstract--The artificial intelligence (AI) system designer for thermal comfort faces insufficient data recorded from the current user or overfitting due to unreliable training data. This work introduces the reliable data set for training the AI subsystem for thermal comfort. This paper presents the control algorithm based on shallow supervised learning, which is simple enough to be implemented in the Internet of Things (IoT) system for residential usage using ASHRAE RP-884 and ASHRAE Global Thermal Comfort Database II. No training data for thermal comfort is available as reliable as this dataset, but the direct use of this data can lead to overfitting. This work offers the algorithm for data filtering and semantic data augmentation for the ASHRAE database for the supervised learning process. Overfitting always becomes a problem due to the psychological aspect involved in the thermal comfort decision. The method to check the AI system based on the psychrometric chart against overfitting is presented. This paper also assesses the most important parameters needed to achieve human thermal comfort. This method can support the development of reinforced learning for thermal comfort. HE decarbonising heat and buildings has become one heat pump is not a drop-in replacement for gas-boilers [5]. The UK is committed to If the heat pump is installed in poorly performed or leaky reaching net-zero emissions by 2050 [1]. The support includes buildings, the efficiency will decrease.

artificial intelligence, database, machine learning, (14 more...)

2303.03873

Country:

Europe > United Kingdom > England > Greater London > London (0.14)
Europe > United Kingdom > England > Merseyside > Liverpool (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
(7 more...)

Genre: Research Report (0.50)

Industry:

Energy > Renewable (1.00)
Construction & Engineering (1.00)
Information Technology > Smart Houses & Appliances (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.80)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Single-photon Image Super-resolution via Self-supervised Learning

Chen, Yiwei, Jiang, Chen, Pan, Yu

Single-Photon Image Super-Resolution (SPISR) aims to recover a high-resolution volumetric photon counting cube from a noisy low-resolution one by computational imaging algorithms. In real-world scenarios, pairs of training samples are often expensive or impossible to obtain. By extending Equivariant Imaging (EI) to volumetric single-photon data, we propose a self-supervised learning framework for the SPISR task. Particularly, using the Poisson unbiased Kullback-Leibler risk estimator and equivariance, our method is able to learn from noisy measurements without ground truths. Comprehensive experiments on simulated and real-world dataset demonstrate that the proposed method achieves comparable performance with supervised learning and outperforms interpolation-based methods.

artificial intelligence, imaging, machine learning, (16 more...)

2303.02033

Country: Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.83)

Bordes, Florian, Balestriero, Randall, Vincent, Pascal

Towards Democratizing Joint-Embedding Self-Supervised Learning

Joint Embedding Self-Supervised Learning (JE-SSL) has seen rapid developments in recent years, due to its promise to effectively leverage large unlabeled data. The development of JE-SSL methods was driven primarily by the search for ever increasing downstream classification accuracies, using huge computational resources, and typically built upon insights and intuitions inherited from a close parent JE-SSL method. This has led unwittingly to numerous pre-conceived ideas that carried over across methods e.g. that SimCLR requires very large mini batches to yield competitive accuracies; that strong and computationally slow data augmentations are required. In this work, we debunk several such ill-formed a priori ideas in the hope to unleash the full potential of JE-SSL free of unnecessary limitations. In fact, when carefully evaluating performances across different downstream tasks and properly optimizing hyper-parameters of the methods, we most often -- if not always -- see that these widespread misconceptions do not hold. For example we show that it is possible to train SimCLR to learn useful representations, while using a single image patch as negative example, and simple Gaussian noise as the only data augmentation for the positive pair. Along these lines, in the hope to democratize JE-SSL and to allow researchers to easily make more extensive evaluations of their methods, we introduce an optimized PyTorch library for SSL.

artificial intelligence, batch size, machine learning, (18 more...)

2303.01986

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Schmutz, Hugo, Humbert, Olivier, Mattei, Pierre-Alexandre

Don't fear the unlabelled: safe semi-supervised learning via simple debiasing

Semi-supervised learning (SSL) provides an effective means of leveraging unlabelled data to improve a model's performance. Even though the domain has received a considerable amount of attention in the past years, most methods present the common drawback of lacking theoretical guarantees. Our starting point is to notice that the estimate of the risk that most discriminative SSL methods minimise is biased, even asymptotically. This bias impedes the use of standard statistical learning theory and can hurt empirical performance. We propose a simple way of removing the bias. Our debiasing approach is straightforward to implement and applicable to most deep SSL methods. We provide simple theoretical guarantees on the trustworthiness of these modified methods, without having to rely on the strong assumptions on the data distribution that SSL theory usually requires. In particular, we provide generalisation error bounds for the proposed methods. We evaluate debiased versions of different existing SSL methods, such as the Pseudolabel method and Fixmatch, and show that debiasing can compete with classic deep SSL techniques in various settings by providing better calibrated models. Additionally, we provide a theoretical explanation of the intuition of the popular SSL methods. The promise of semi-supervised learning (SSL) is to be able to learn powerful predictive models using partially labelled data. In turn, this would allow machine learning to be less dependent on the often costly and sometimes dangerously biased task of labelling data. Scudder's (1965) untaught pattern recognition machine--simply replaced unknown labels with predictions made by some estimate of the predictive model and used the obtained pseudo-labels to refine their initial estimate. Other more complex branches of SSL have been explored since, notably using generative models (from McLachlan, 1977, to Kingma et al., 2014) or graphs (notably following Zhu et al., 2003). Deep neural networks, which are state-of-the-art supervised predictors, have been trained successfully using SSL. Somewhat surprisingly, the main ingredient of their success is still the notion of pseudo-labels (or one of its variants), combined with systematic use of data augmentation (e.g. An obvious SSL baseline is simply throwing away the unlabelled data.

artificial intelligence, inductive learning, machine learning, (18 more...)

2203.07512

Country:

Europe > France > Provence-Alpes-Côte d'Azur > Alpes-Maritimes > Nice (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre:

Research Report > Experimental Study (0.68)
Research Report > New Finding (0.67)

Industry: Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)