AITopics

2204.0715

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.14)
North America > Canada > Ontario > Thunder Bay (0.14)
Europe > Estonia > Harju County > Tallinn (0.05)
(3 more...)

Genre: Research Report (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.46)

Chatzimparmpas, Angelos, Paulovich, Fernando V., Kerren, Andreas

HardVis: Visual Analytics to Handle Instance Hardness Using Undersampling and Oversampling Techniques

Despite the tremendous advances in machine learning (ML), training with imbalanced data still poses challenges in many real-world applications. Among a series of diverse techniques to solve this problem, sampling algorithms are regarded as an efficient solution. However, the problem is more fundamental, with many works emphasizing the importance of instance hardness. This issue refers to the significance of managing unsafe or potentially noisy instances that are more likely to be misclassified and serve as the root cause of poor classification performance. This paper introduces HardVis, a visual analytics system designed to handle instance hardness mainly in imbalanced classification scenarios. Our proposed system assists users in visually comparing different distributions of data types, selecting types of instances based on local characteristics that will later be affected by the active sampling method, and validating which suggestions from undersampling or oversampling techniques are beneficial for the ML model. Additionally, rather than uniformly undersampling/oversampling a specific class, we allow users to find and sample easy and difficult to classify training instances from all classes. Users can explore subsets of data from different perspectives to decide all those parameters, while HardVis keeps track of their steps and evaluates the model's predictive performance in a test set separately. The end result is a well-balanced data set that boosts the predictive power of the ML model. The efficacy and effectiveness of HardVis are demonstrated with a hypothetical usage scenario and a use case. Finally, we also look at how useful our system is based on feedback we received from ML experts.

artificial intelligence, inductive learning, machine learning, (14 more...)

doi: 10.1111/cgf.14726

2203.15753

Country:

Europe > Sweden > Östergötland County > Linköping (0.04)
Europe > Netherlands > North Brabant > Eindhoven (0.04)

Genre:

Research Report (1.00)
Workflow (0.69)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)

Learning Representations for New Sound Classes With Continual Self-Supervised Learning

Wang, Zhepei, Subakan, Cem, Jiang, Xilin, Wu, Junkai, Tzinis, Efthymios, Ravanelli, Mirco, Smaragdis, Paris

In this paper, we work on a sound recognition system that continually incorporates new sound classes. Our main goal is to develop a framework where the model can be updated without relying on labeled data. For this purpose, we propose adopting representation learning, where an encoder is trained using unlabeled data. This learning framework enables the study and implementation of a practically relevant use case where only a small amount of the labels is available in a continual learning context. We also make the empirical observation that a similarity-based representation learning method within this framework is robust to forgetting even if no explicit mechanism against forgetting is employed. We show that this approach obtains similar performance compared to several distillation-based continual learning methods when employed on self-supervised representation learning methods.

artificial intelligence, machine learning, representation, (16 more...)

doi: 10.1109/LSP.2022.3229643

2205.0739

Country:

North America > United States > Illinois (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.42)

Bitew, Semere Kiros, Hadifar, Amir, Sterckx, Lucas, Deleu, Johannes, Develder, Chris, Demeester, Thomas

Learning to Reuse Distractors to support Multiple Choice Question Generation in Education

Multiple choice questions (MCQs) are widely used in digital learning systems, as they allow for automating the assessment process. However, due to the increased digital literacy of students and the advent of social media platforms, MCQ tests are widely shared online, and teachers are continuously challenged to create new questions, which is an expensive and time-consuming task. A particularly sensitive aspect of MCQ creation is to devise relevant distractors, i.e., wrong answers that are not easily identifiable as being wrong. This paper studies how a large existing set of manually created answers and distractors for questions over a variety of domains, subjects, and languages can be leveraged to help teachers in creating new MCQs, by the smart reuse of existing distractors. We built several data-driven models based on context-aware question and distractor representations, and compared them with static feature-based models. The proposed models are evaluated with automated metrics and in a realistic user test with teachers. Both automatic and human evaluations indicate that context-aware models consistently outperform a static feature-based approach. For our best-performing context-aware model, on average 3 distractors out of the 10 shown to teachers were rated as high-quality distractors. We create a performance benchmark, and make it public, to enable comparison between different approaches and to introduce a more standardized evaluation of the task. The benchmark contains a test of 298 educational questions covering multiple subjects & languages and a 77k multilingual pool of distractor vocabulary for future research.

artificial intelligence, machine learning, natural language, (20 more...)

doi: 10.1109/TLT.2022.3226523

2210.13964

Country:

Europe > Sweden > Östergötland County > Linköping (0.04)
Africa > Ethiopia (0.04)
South America > Uruguay > Maldonado > Maldonado (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
(3 more...)

Harutyunyan, Hrayr, Steeg, Greg Ver, Galstyan, Aram

Formal limitations of sample-wise information-theoretic generalization bounds

Some of the tightest information-theoretic generalization bounds depend on the average information between the learned hypothesis and a single training example. However, these sample-wise bounds were derived only for expected generalization gap. We show that even for expected squared generalization gap no such sample-wise information-theoretic bounds exist. The same is true for PAC-Bayes and single-draw bounds. Remarkably, PAC-Bayes, single-draw and expected squared generalization gap bounds that depend on information in pairs of examples exist.

artificial intelligence, generalization, machine learning, (17 more...)

doi: 10.1109/ITW54588.2022.9965850

2205.06915

Country: Oceania > Australia > New South Wales > Sydney (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.54)

A Sandbox Tool to Bias(Stress)-Test Fairness Algorithms

Akpinar, Nil-Jana, Nagireddy, Manish, Stapleton, Logan, Cheng, Hao-Fei, Zhu, Haiyi, Wu, Steven, Heidari, Hoda

Motivated by the growing importance of reducing unfairness in ML predictions, Fair-ML researchers have presented an extensive suite of algorithmic 'fairness-enhancing' remedies. Most existing algorithms, however, are agnostic to the sources of the observed unfairness. As a result, the literature currently lacks guiding frameworks to specify conditions under which each algorithmic intervention can potentially alleviate the underpinning cause of unfairness. To close this gap, we scrutinize the underlying biases (e.g., in the training data or design choices) that cause observational unfairness. We present the conceptual idea and a first implementation of a bias-injection sandbox tool to investigate fairness consequences of various biases and assess the effectiveness of algorithmic remedies in the presence of specific types of bias. We call this process the bias(stress)-testing of algorithmic interventions. Unlike existing toolkits, ours provides a controlled environment to counterfactually inject biases in the ML pipeline. This stylized setup offers the distinct capability of testing fairness interventions beyond observational data and against an unbiased benchmark. In particular, we can test whether a given remedy can alleviate the injected bias by comparing the predictions resulting after the intervention in the biased setting with true labels in the unbiased regime-that is, before any bias injection. We illustrate the utility of our toolkit via a proof-of-concept case study on synthetic data. Our empirical analysis showcases the type of insights that can be obtained through our simulations.

data mining, intervention, machine learning, (18 more...)

2204.10233

Country:

North America > United States > Minnesota (0.04)
North America > United States > Massachusetts (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (1.00)
Education (1.00)

Technology:

Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

Dubois, Yann, Hashimoto, Tatsunori, Ermon, Stefano, Liang, Percy

Improving Self-Supervised Learning by Characterizing Idealized Representations

arXiv.org Artificial IntelligenceDec-12-2022

Despite the empirical successes of self-supervised learning (SSL) methods, it is unclear what characteristics of their representations lead to high downstream accuracies. In this work, we characterize properties that SSL representations should ideally satisfy. Specifically, we prove necessary and sufficient conditions such that for any task invariant to given data augmentations, desired probes (e.g., linear or MLP) trained on that representation attain perfect accuracy. These requirements lead to a unifying conceptual framework for improving existing SSL methods and deriving new ones. For contrastive learning, our framework prescribes simple but significant improvements to previous methods such as using asymmetric projection heads. For non-contrastive learning, we use our framework to derive a simple and novel objective. Our resulting SSL algorithms outperform baselines on standard benchmarks, including SwAV+multicrops on linear probing of ImageNet.

artificial intelligence, machine learning, representation, (17 more...)

2209.06235

Country:

North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
(2 more...)

Zheng, Hao, Lapata, Mirella

Real-World Compositional Generalization with Disentangled Sequence-to-Sequence Learning

arXiv.org Artificial IntelligenceDec-12-2022

Compositional generalization is a basic mechanism in human language learning, which current neural networks struggle with. A recently proposed Disentangled sequence-to-sequence model (Dangle) shows promising generalization capability by learning specialized encodings for each decoding step. We introduce two key modifications to this model which encourage more disentangled representations and improve its compute and memory efficiency, allowing us to tackle compositional generalization in a more realistic setting. Specifically, instead of adaptively re-encoding source keys and values at each time step, we disentangle their representations and only re-encode keys periodically, at some interval. Our new architecture leads to better generalization performance across existing tasks and datasets, and a new machine translation benchmark which we create by detecting naturally occurring compositional patterns in relation to a training set. We show this methodology better emulates real-world requirements than artificial challenges.

computational linguistic, machine learning, natural language, (18 more...)

2212.05982

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(12 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

#artificialintelligenceDec-11-2022, 07:55:30 GMT

10 Metrics to Evaluate Supervised Machine Learning Models

Supervised machine learning algorithms try to model the relationship between features (independent variables) and a label (target variable) given a set of observations. The two main types of supervised learning tasks are classification and regression. Classification is a supervised learning technique in which the target variable is discrete (or categorical). Consider a bank collecting a dataset that contains several features about their customers such as the number of products, transactions within the last month, expected monthly income, and so on. The bank is trying to predict whether a customer will churn based on this dataset.

artificial intelligence, machine learning, prediction, (16 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.76)

arXiv.org Artificial IntelligenceDec-11-2022

Toward Robust Graph Semi-Supervised Learning against Extreme Data Scarcity

Ding, Kaize, Nouri, Elnaz, Zheng, Guoqing, Liu, Huan, White, Ryen

The success of graph neural networks on graph-based web mining highly relies on abundant human-annotated data, which is laborious to obtain in practice. When only few labeled nodes are available, how to improve their robustness is a key to achieve replicable and sustainable graph semi-supervised learning. Though self-training has been shown to be powerful for semi-supervised learning, its application on graph-structured data may fail because (1) larger receptive fields are not leveraged to capture long-range node interactions, which exacerbates the difficulty of propagating feature-label patterns from labeled nodes to unlabeled nodes; and (2) limited labeled data makes it challenging to learn well-separated decision boundaries for different node classes without explicitly capturing the underlying semantic structure. To address the challenges of capturing informative structural and semantic knowledge, we propose a new graph data augmentation framework, AGST (Augmented Graph Self-Training), which is built with two new (i.e., structural and semantic) augmentation modules on top of a decoupled GST backbone. In this work, we investigate whether this novel framework can learn a robust graph predictive model under the low-data context. We conduct comprehensive evaluations on semi-supervised node classification under different scenarios of limited labeled-node data. The experimental results demonstrate the unique contributions of the novel data augmentation framework for node classification with few labeled data.

artificial intelligence, machine learning, node, (19 more...)

2208.12422

Country:

North America > United States > Arizona (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.48)

Industry:

Education (0.48)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)