AITopics | Inductive Learning

Inductive Learning

Inductive learning, or induction, is the process of creating generalizations from individual instances.

Learning to Write with Coherence From Negative Examples

Son, Seonil, Lim, Jaeseo, Jang, Youwon, Lee, Jaeyoung, Zhang, Byoung-Tak

arXiv.org Artificial Intelligence, Sep-22-2022

Coherence is one of the critical factors that determine the quality of writing. We propose a writing relevance (WR) training method for neural encoder-decoder natural language generation (NLG) models that improves the coherence of the continuation by leveraging negative examples. The WR loss regresses the vector representation of the context and generated sentence toward the positive continuation by contrasting it with the negatives. We compare our approach with Unlikelihood (UL) training in a text continuation task on commonsense natural language inference (NLI) corpora to show which method better models coherence by avoiding unlikely continuations. The preference for our approach in human evaluation shows the efficacy of our method in improving coherence.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv: 2209.10922

Country:

Asia > South Korea > Seoul > Seoul (0.06)
Oceania > Australia (0.05)
North America > Canada > Ontario > Toronto (0.05)
(4 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.66)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.55)

Automated Coronary Calcium Scoring using U-Net Models through Semi-supervised Learning on Non-Gated CT Scans

Singh, Sanskriti

arXiv.org Artificial Intelligence, Sep-21-2022

Every year, thousands of people die from heart attacks. Undiagnosed heart disease can take people by surprise, since many current medical plans do not cover the cost of searching for calcification on CT scans. A gated CT scan is taken only if someone is suspected of having a heart problem; otherwise, the patient has no way of knowing about a possible heart attack or heart disease. Nongated CT scans are taken more often, but it is harder to detect calcification on them, and they are usually taken for a purpose other than locating calcification in arteries. In fact, in real time, coronary artery calcification scores are calculated only on gated CT scans, not on nongated CT scans. A U-Net model trained on the gated scans of the Coronary Calcium and chest CT's dataset achieved a Dice coefficient of 0.95 on its untouched test set. This model was then used to predict on nongated CT scans, where it reached a mean absolute error (MAE) of 674.19 and a bucket classification accuracy of 41% (5 classes). Through analysis of the images and the information stored in them, mathematical equations were derived and used to automatically crop the images around the location of the heart. With semi-supervised learning, the newly cropped nongated scans came to closely resemble gated CT scans, improving performance by 91% in MAE (62.38) and 23% in accuracy.
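The two metrics quoted in this abstract can be made concrete with a short sketch. The code below is not from the paper: the function names and the toy masks are hypothetical, and it only shows the standard definition of the Dice coefficient for binary masks and checks that going from an MAE of 674.19 to 62.38 is indeed roughly a 91% reduction.

```python
import numpy as np


def dice_coefficient(pred, target, eps=1e-8):
    """Dice = 2 * |pred AND target| / (|pred| + |target|) for binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    intersection = np.logical_and(pred, target).sum()
    # eps only guards against division by zero when both masks are empty.
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)


def relative_improvement(before, after):
    """Fractional reduction of an error metric where lower is better."""
    return (before - after) / before


# Toy 4x4 segmentation masks (made-up data, not taken from the paper).
pred = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 0, 0],
                 [0, 0, 0, 0]])
target = np.array([[0, 1, 1, 0],
                   [0, 1, 0, 0],
                   [0, 0, 0, 0],
                   [0, 0, 0, 0]])

toy_dice = dice_coefficient(pred, target)        # 2 * 3 / (4 + 3)
mae_gain = relative_improvement(674.19, 62.38)   # MAE values from the abstract

print(f"toy Dice: {toy_dice:.3f}")
print(f"MAE reduction: {mae_gain:.1%}")
```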

artificial intelligence, inductive learning, machine learning, (4 more...)

arXiv: 2206.10455

Genre: Research Report (0.66)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.60)

SoLar: Sinkhorn Label Refinery for Imbalanced Partial-Label Learning

Wang, Haobo, Xia, Mingxuan, Li, Yixuan, Mao, Yuren, Feng, Lei, Chen, Gang, Zhao, Junbo

arXiv.org Artificial Intelligence, Sep-21-2022

Partial-label learning (PLL) is a peculiar weakly-supervised learning task where the training samples are generally associated with a set of candidate labels instead of a single ground truth. While a variety of label disambiguation methods have been proposed in this domain, they normally assume a class-balanced scenario that may not hold in many real-world applications. Empirically, we observe degenerated performance of the prior methods when facing the combinatorial challenge from the long-tailed distribution and partial-labeling. In this work, we first identify the major reasons that the prior work failed. We subsequently propose SoLar, a novel Optimal Transport-based framework that refines the disambiguated labels towards matching the marginal class prior distribution. SoLar additionally incorporates a new and systematic mechanism for estimating the long-tailed class prior distribution under the PLL setup. Through extensive experiments, SoLar exhibits substantially superior results on standardized benchmarks compared to the previous state-of-the-art PLL methods. Code and data are available at: https://github.com/hbzju/SoLar .
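The idea of refining labels so that they match a class prior can be illustrated with generic Sinkhorn-Knopp iterations. This is a minimal sketch under stated assumptions, not the SoLar implementation (see the repository linked in the abstract for that): the function name `sinkhorn_refine`, the toy candidate sets, and the assumption that the class prior is already known are all hypothetical.

```python
import numpy as np


def sinkhorn_refine(probs, candidate_mask, class_prior, n_iters=500):
    """Rescale masked predictions so every row sums to 1 and the column
    means approach class_prior (plain Sinkhorn-Knopp scaling)."""
    n = probs.shape[0]
    q = probs * candidate_mask                     # zero outside candidate sets
    target_col_sums = n * np.asarray(class_prior, dtype=float)
    for _ in range(n_iters):
        # Column step: total mass per class matches the prior.
        q = q * (target_col_sums / q.sum(axis=0, keepdims=True))
        # Row step: each sample distributes exactly one unit of mass.
        q = q / q.sum(axis=1, keepdims=True)
    return q


rng = np.random.default_rng(0)
probs = rng.uniform(0.1, 1.0, size=(6, 3))         # stand-in model outputs
candidate_mask = np.array([[1, 1, 0],              # 1 = label is a candidate
                           [1, 1, 0],
                           [1, 0, 1],
                           [1, 1, 1],
                           [0, 1, 1],
                           [1, 1, 1]], dtype=float)
class_prior = np.array([0.5, 0.3, 0.2])            # assumed long-tailed prior

refined = sinkhorn_refine(probs, candidate_mask, class_prior)
print(refined.round(3))
print("class marginals:", refined.mean(axis=0).round(3))
```

The scaling only rescales existing entries, so labels outside a sample's candidate set stay at exactly zero while the overall class proportions are pulled towards the prior.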

artificial intelligence, inductive learning, machine learning, (20 more...)

arXiv: 2209.10365

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > California (0.04)
Asia > China > Zhejiang Province (0.04)
Asia > China > Chongqing Province > Chongqing (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics

Siddiqui, Shoaib Ahmed, Rajkumar, Nitarshan, Maharaj, Tegan, Krueger, David, Hooker, Sara

arXiv.org Artificial Intelligence, Sep-20-2022

Modern machine learning research relies on relatively few carefully curated datasets. Even in these datasets, and typically in 'untidy' or raw data, practitioners are faced with significant issues of data quality and diversity which can be prohibitively labor intensive to address. Existing methods for dealing with these challenges tend to make strong assumptions about the particular issues at play, and often require a priori knowledge or metadata such as domain labels. Our work is orthogonal to these methods: we instead focus on providing a unified and efficient framework for Metadata Archaeology -- uncovering and inferring metadata of examples in a dataset. We curate different subsets of data that might exist in a dataset (e.g. mislabeled, atypical, or out-of-distribution examples) using simple transformations, and leverage differences in learning dynamics between these probe suites to infer metadata of interest. Our method is on par with far more sophisticated mitigation methods across different tasks: identifying and correcting mislabeled examples, classifying minority-group samples, prioritizing points relevant for training and enabling scalable human auditing of relevant examples.

artificial intelligence, machine learning, natural language, (15 more...)

arXiv: 2209.10015

Country:

North America > Canada > Ontario > Toronto (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
(3 more...)

Genre: Research Report (0.65)

Industry:

Health & Medicine (1.00)
Government (0.68)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Weak Disambiguation for Partial Structured Output Learning

Lu, Xiaolei, Chow, Tommy W. S.

arXiv.org Artificial Intelligence, Sep-19-2022

Existing disambiguation strategies for partial structured output learning do not generalize well when some candidates are false positives or are similar to the ground-truth label. In this paper, we propose a novel weak disambiguation for partial structured output learning (WD-PSL). First, a piecewise large margin formulation is generalized to partial structured output learning, which effectively avoids handling a large number of candidate structured outputs for complex structures. Second, in the proposed weak disambiguation strategy, each candidate label is assigned a confidence value indicating how likely it is to be the true label, which aims to reduce the negative effects of wrong ground-truth label assignment in the learning process. Then two large margins are formulated to combine two types of constraints: the disambiguation between candidates and non-candidates, and the weak disambiguation for candidates. In the framework of alternating optimization, a new 2n-slack variables cutting plane algorithm is developed to accelerate each iteration of optimization. Experimental results on several sequence labeling tasks in natural language processing show the effectiveness of the proposed model.

artificial intelligence, machine learning, natural language, (19 more...)

doi: 10.1109/TCYB.2020.3000053

arXiv: 2209.0941

Country:

Asia > China > Hong Kong (0.04)
North America > United States > Pennsylvania (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.96)
(2 more...)

Machine Learning Class Numbers of Real Quadratic Fields

Amir, Malik, He, Yang-Hui, Lee, Kyu-Hwan, Oliver, Thomas, Sultanow, Eldar

arXiv.org Artificial Intelligence, Sep-19-2022

We implement and interpret various supervised learning experiments involving real quadratic fields with class numbers 1, 2 and 3. We quantify the relative difficulties in separating class numbers of matching/different parity from a data-scientific perspective, apply the methodology of feature analysis and principal component analysis, and use symbolic classification to develop machine-learned formulas for class numbers 1, 2 and 3 that apply to our dataset.

artificial intelligence, inductive learning, machine learning, (16 more...)

arXiv: 2209.09283

Country:

North America > United States > Connecticut > Tolland County > Storrs (0.14)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(6 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Overcoming Language Priors in Visual Question Answering via Distinguishing Superficially Similar Instances

Wu, Yike, Zhao, Yu, Zhao, Shiwan, Zhang, Ying, Yuan, Xiaojie, Zhao, Guoqing, Jiang, Ning

arXiv.org Artificial Intelligence, Sep-18-2022

Despite the great progress of Visual Question Answering (VQA), current VQA models rely heavily on the superficial correlation between the question type and its corresponding frequent answers (i.e., language priors) to make predictions, without really understanding the input. In this work, we define training instances with the same question type but different answers as superficially similar instances, and attribute the language priors to the VQA model's confusion on such instances. To solve this problem, we propose a novel training framework that explicitly encourages the VQA model to distinguish between the superficially similar instances. Specifically, for each training instance, we first construct a set that contains its superficially similar counterparts. Then we exploit the proposed distinguishing module to increase the distance between the instance and its counterparts in the answer space. In this way, the VQA model is forced to further focus on the other parts of the input beyond the question type, which helps to overcome the language priors. Experimental results show that our method achieves state-of-the-art performance on VQA-CP v2. Code is available at https://github.com/wyk-nku/Distinguishing-VQA.git .

artificial intelligence, machine learning, question type, (19 more...)

arXiv: 2209.08529

Country: Asia > China > Tianjin Province > Tianjin (0.04)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

The Geometry of Self-supervised Learning Models and its Impact on Transfer Learning

Cosentino, Romain, Shekkizhar, Sarath, Soltanolkotabi, Mahdi, Avestimehr, Salman, Ortega, Antonio

arXiv.org Artificial Intelligence, Sep-18-2022

Self-supervised learning (SSL) has emerged as a desirable paradigm in computer vision due to the inability of supervised models to learn representations that can generalize in domains with limited labels. The recent popularity of SSL has led to the development of several models that make use of diverse training strategies, architectures, and data augmentation policies with no existing unified framework to study or assess their effectiveness in transfer learning. We propose a data-driven geometric strategy to analyze different SSL models using local neighborhoods in the feature space induced by each. Unlike existing approaches that consider mathematical approximations of the parameters, individual components, or optimization landscape, our work aims to explore the geometric properties of the representation manifolds learned by SSL models. Our proposed manifold graph metrics (MGMs) provide insights into the geometric similarities and differences between available SSL models, their invariances with respect to specific augmentations, and their performances on transfer learning tasks. Our key findings are twofold: (i) contrary to popular belief, the geometry of SSL models is not tied to their training paradigm (contrastive, non-contrastive, and cluster-based); (ii) we can predict the transfer learning capability for a specific model based on the geometric properties of its semantic and augmentation manifolds.

artificial intelligence, inductive learning, machine learning, (19 more...)

arXiv: 2209.08622

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.85)

Information-Theoretic Characterization of the Generalization Error for Iterative Semi-Supervised Learning

He, Haiyun, Yan, Hanshu, Tan, Vincent Y. F.

arXiv.org Artificial Intelligence, Sep-18-2022

Using information-theoretic principles, we consider the generalization error (gen-error) of iterative semi-supervised learning (SSL) algorithms that iteratively generate pseudo-labels for a large amount of unlabelled data to progressively refine the model parameters. In contrast to most previous works that bound the gen-error, we provide an exact expression for the gen-error and particularize it to the binary Gaussian mixture model. Our theoretical results suggest that when the class conditional variances are not too large, the gen-error decreases with the number of iterations, but quickly saturates. On the flip side, if the class conditional variances (and so the amount of overlap between the classes) are large, the gen-error increases with the number of iterations. To mitigate this undesirable effect, we show that regularization can reduce the gen-error. The theoretical results are corroborated by extensive experiments on the MNIST and CIFAR datasets in which we notice that for easy-to-distinguish classes, the gen-error improves after several pseudo-labelling iterations, but saturates afterwards, and for more difficult-to-distinguish classes, regularization improves the generalization performance.
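The iterative pseudo-labelling loop described in this abstract can be sketched on a toy binary Gaussian mixture. This is a simplified illustration, not the paper's algorithm or analysis: the function name, the mean-based estimator, and the values of `mu`, `sigma`, and the batch sizes are all assumptions chosen for readability.

```python
import numpy as np


def iterative_pseudo_labelling(x_lab, y_lab, x_unl_batches):
    """Estimate the mean direction theta of a +/-mu Gaussian mixture and
    refine it with one batch of pseudo-labelled data per iteration."""
    # Supervised starting point; labels are in {-1, +1}.
    theta = (y_lab[:, None] * x_lab).mean(axis=0)
    history = [theta]
    for x_unl in x_unl_batches:
        pseudo = np.sign(x_unl @ theta)       # pseudo-labels from current model
        pseudo[pseudo == 0] = 1.0             # break exact ties arbitrarily
        theta = (pseudo[:, None] * x_unl).mean(axis=0)
        history.append(theta)
    return history


rng = np.random.default_rng(0)
mu = np.array([1.0, 1.0])     # assumed class mean (classes sit at +mu and -mu)
sigma = 0.5                   # assumed class-conditional standard deviation


def sample(n):
    """Draw n points from the two-component mixture with their labels."""
    y = rng.choice([-1.0, 1.0], size=n)
    x = y[:, None] * mu + sigma * rng.normal(size=(n, 2))
    return x, y


x_lab, y_lab = sample(10)                         # small labelled set
batches = [sample(200)[0] for _ in range(5)]      # unlabelled batches
history = iterative_pseudo_labelling(x_lab, y_lab, batches)

for t, theta in enumerate(history):
    print(t, np.linalg.norm(theta - mu).round(3))
```

Raising `sigma` in this toy setup increases class overlap and therefore the share of wrong pseudo-labels, which is the regime the abstract flags as problematic.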

artificial intelligence, generalization error, machine learning, (19 more...)

arXiv: 2110.00926

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Singapore (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.71)
(3 more...)

Self-Supervised Learning with an Information Maximization Criterion

Ozsoy, Serdar, Hamdan, Shadi, Arik, Sercan Ö., Yuret, Deniz, Erdogan, Alper T.

arXiv.org Artificial Intelligence, Sep-16-2022

Self-supervised learning allows AI systems to learn effective representations from large amounts of data using tasks that do not require costly labeling. Mode collapse, i.e., the model producing identical representations for all inputs, is a central problem to many self-supervised learning approaches, making self-supervised tasks, such as matching distorted variants of the inputs, ineffective. In this article, we argue that a straightforward application of information maximization among alternative latent representations of the same input naturally solves the collapse problem and achieves competitive empirical results. We propose a self-supervised learning method, CorInfoMax, that uses a second-order statistics-based mutual information measure that reflects the level of correlation among its arguments. Maximizing this correlative information measure between alternative representations of the same input serves two purposes: (1) it avoids the collapse problem by generating feature vectors with non-degenerate covariances; (2) it establishes relevance among alternative representations by increasing the linear dependence among them. An approximation of the proposed information maximization objective simplifies to a Euclidean distance-based objective function regularized by the log-determinant of the feature covariance matrix. The regularization term acts as a natural barrier against feature space degeneracy. Consequently, beyond avoiding complete output collapse to a single point, the proposed approach also prevents dimensional collapse by encouraging the spread of information across the whole feature space. Numerical experiments demonstrate that CorInfoMax achieves better or competitive performance results relative to the state-of-the-art SSL approaches.
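The abstract describes an objective that combines a Euclidean distance term with a log-determinant regularizer on the feature covariance. The sketch below shows an objective of that general shape, under assumptions: the function name, the weighting `mu`, the diagonal perturbation `eps`, and the use of a plain batch covariance are hypothetical choices, and the exact CorInfoMax objective may differ.

```python
import numpy as np


def logdet_regularized_loss(z1, z2, eps=1e-3, mu=1.0):
    """Distance between two views minus the log-determinants of their
    (slightly perturbed) feature covariance matrices."""
    n, d = z1.shape

    def logdet_cov(z):
        zc = z - z.mean(axis=0, keepdims=True)
        cov = zc.T @ zc / n
        # eps * I keeps the determinant positive even for collapsed features.
        _, logdet = np.linalg.slogdet(cov + eps * np.eye(d))
        return logdet

    distance = np.mean(np.sum((z1 - z2) ** 2, axis=1))
    # A large log-determinant (spread-out features) lowers the loss.
    return mu * distance - logdet_cov(z1) - logdet_cov(z2)


rng = np.random.default_rng(0)
spread_a = rng.normal(size=(256, 4))                      # well-spread features
spread_b = spread_a + 0.01 * rng.normal(size=(256, 4))    # second "view"
collapsed = np.ones((256, 4))                             # every input maps to one point

loss_spread = logdet_regularized_loss(spread_a, spread_b)
loss_collapsed = logdet_regularized_loss(collapsed, collapsed)

print(f"spread:    {loss_spread:.2f}")
print(f"collapsed: {loss_collapsed:.2f}")
```

In this toy comparison the collapsed representation has zero distance between views but is heavily penalized by the log-determinant terms, which is the barrier effect the abstract attributes to the regularizer.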

artificial intelligence, machine learning, representation, (20 more...)

arXiv: 2209.07999

Country:

Asia > Middle East > Republic of Türkiye (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > South Carolina > Charleston County > Charleston (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
