AITopics

2304.14293

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(11 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (0.95)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.68)
(5 more...)

Liu, Shuwen, Grau, Bernardo Cuenca, Horrocks, Ian, Kostylev, Egor V.

Revisiting Inferential Benchmarks for Knowledge Graph Completion

Knowledge Graph (KG) completion is the problem of extending an incomplete KG with missing facts. A key feature of Machine Learning approaches for KG completion is their ability to learn inference patterns, so that the predicted facts are the results of applying these patterns to the KG. Standard completion benchmarks, however, are not well-suited for evaluating models' abilities to learn patterns, because the training and test sets of these benchmarks are a random split of a given KG and hence do not capture the causality of inference patterns. We propose a novel approach for designing KG completion benchmarks based on the following principles: there is a set of logical rules so that the missing facts are the results of the rules' application; the training set includes both premises matching rule antecedents and the corresponding conclusions; the test set consists of the results of applying the rules to the training set; the negative examples are designed to discourage the models from learning rules not entailed by the rule set. We use our methodology to generate several benchmarks and evaluate a wide range of existing KG completion systems. Our results provide novel insights on the ability of existing models to induce inference patterns from incomplete KGs.
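
As a rough illustration of the benchmark principles listed in the abstract above, the sketch below derives a test set by applying a rule to a training set. The rule (`parent_of` composed twice gives `grandparent_of`), the entity names, and the choice to exclude conclusions already present in the training set are all assumptions made for illustration; they are not taken from the paper.

```python
from itertools import product

Triple = tuple[str, str, str]  # (head entity, relation, tail entity)


def apply_rule(facts: set[Triple]) -> set[Triple]:
    """Apply one hypothetical rule to a set of facts:
    parent_of(x, y) AND parent_of(y, z) -> grandparent_of(x, z)."""
    derived: set[Triple] = set()
    for (x, r1, y1), (y2, r2, z) in product(facts, repeat=2):
        if r1 == "parent_of" and r2 == "parent_of" and y1 == y2:
            derived.add((x, "grandparent_of", z))
    return derived


# The training set holds premises matching the rule antecedent plus one
# corresponding conclusion, so a model can observe the pattern.
train = {
    ("ann", "parent_of", "bob"),
    ("bob", "parent_of", "cid"),
    ("cid", "parent_of", "dee"),
    ("ann", "grandparent_of", "cid"),
}

# The test set is what the rule derives from the training set; conclusions
# already in the training set are left out (an assumption of this sketch).
test = apply_rule(train) - train
print(test)
```

Negative examples, which the abstract says are designed to discourage learning rules not entailed by the rule set, are not modelled here.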

artificial intelligence, benchmark, machine learning, (18 more...)

2306.04814

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Norway > Eastern Norway > Oslo (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
(2 more...)

Thakur, Himanshu, Jain, Atishay, Vaddamanu, Praneetha, Liang, Paul Pu, Morency, Louis-Philippe

Language Models Get a Gender Makeover: Mitigating Gender Bias with Few-Shot Data Interventions

Societal biases present in pre-trained large language models are a critical issue as these models have been shown to propagate biases in countless downstream applications, rendering them unfair towards specific groups of people. Since large-scale retraining of these models from scratch is both time- and compute-expensive, a variety of approaches have been previously proposed that de-bias a pre-trained model. While the majority of current state-of-the-art debiasing methods focus on changes to the training regime, in this paper, we propose data intervention strategies as a powerful yet simple technique to reduce gender bias in pre-trained models. Specifically, we empirically show that by fine-tuning a pre-trained model on only 10 de-biased (intervened) training examples, the tendency to favor any gender is significantly reduced. Since our proposed method only needs a few training examples, our few-shot debiasing approach is highly feasible and practical. Through extensive experimentation, we show that our debiasing technique performs better than competitive state-of-the-art baselines with minimal loss in language modeling ability.

large language model, machine learning, natural language, (16 more...)

2306.04597

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy > Tuscany > Florence (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(8 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.95)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)

Liang, Paul Pu, Ling, Chun Kai, Cheng, Yun, Obolenskiy, Alex, Liu, Yudong, Pandey, Rohan, Wilf, Alex, Morency, Louis-Philippe, Salakhutdinov, Ruslan

Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications

In many machine learning systems that jointly learn from multiple modalities, a core research question is to understand the nature of multimodal interactions: the emergence of new task-relevant information during learning from both modalities that was not present in either alone. We study this challenge of interaction quantification in a semi-supervised setting with only labeled unimodal data and naturally co-occurring multimodal data (e.g., unlabeled images and captions, video and corresponding audio) that is time-consuming to label. Using a precise information-theoretic definition of interactions, our key contributions are the derivations of lower and upper bounds to quantify the amount of multimodal interactions in this semi-supervised setting. We propose two lower bounds based on the amount of shared information between modalities and the disagreement between separately trained unimodal classifiers, and derive an upper bound through connections to approximate algorithms for min-entropy couplings. We validate these estimated bounds and show how they accurately track true interactions. Finally, two semi-supervised multimodal applications are explored based on these theoretical results: (1) analyzing the relationship between multimodal performance and estimated interactions, and (2) self-supervised learning that embraces disagreement between modalities beyond agreement as is typically done.

artificial intelligence, machine learning, synergy, (19 more...)

2306.04539

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre:

Research Report > New Finding (0.66)
Research Report > Experimental Study (0.66)
Research Report > Promising Solution (0.45)

Industry:

Health & Medicine (1.00)
Education (0.67)
Government > Regional Government > North America Government > United States Government (0.45)
Information Technology > Security & Privacy (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

He, Jie, U, Simon Chi Lok, Gutiérrez-Basulto, Víctor, Pan, Jeff Z.

BUCA: A Binary Classification Approach to Unsupervised Commonsense Question Answering

Unsupervised commonsense reasoning (UCR) is becoming increasingly popular as the construction of commonsense reasoning datasets is expensive, and they are inevitably limited in their scope. A popular approach to UCR is to fine-tune language models with external knowledge (e.g., knowledge graphs), but this usually requires a large number of training examples. In this paper, we propose to transform the downstream multiple choice question answering task into a simpler binary classification task by ranking all candidate answers according to their reasonableness. To this end, for training the model, we convert the knowledge graph triples into reasonable and unreasonable texts. Extensive experimental results show the effectiveness of our approach on various multiple choice question answering benchmarks. Furthermore, compared with existing UCR approaches using KGs, ours is less data hungry. Our code is available at https://github.com/probe2/BUCA.
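
The idea in the abstract above of reducing multiple-choice question answering to binary classification can be sketched as follows: score each candidate answer for reasonableness, then rank the candidates by that score. The function names and the keyword-based `toy_score` are hypothetical stand-ins for a trained classifier; nothing here reflects the actual BUCA code.

```python
from typing import Callable


def rank_candidates(
    question: str,
    candidates: list[str],
    score: Callable[[str], float],
) -> list[str]:
    """Return candidates ordered from most to least reasonable.

    `score` is any function mapping a text to a reasonableness value,
    for example the positive-class probability of a binary classifier.
    """
    return sorted(
        candidates,
        key=lambda answer: score(f"{question} {answer}"),
        reverse=True,
    )


def toy_score(text: str) -> float:
    """Hypothetical stand-in for a trained reasonable/unreasonable classifier."""
    return 0.9 if "drink" in text else 0.1


ranking = rank_candidates(
    "People use a cup",
    ["to fly a plane", "to drink water"],
    toy_score,
)
print(ranking[0])
```

In practice the scorer would be a language model fine-tuned on reasonable and unreasonable texts built from knowledge graph triples, as the abstract describes; the ranking step itself stays the same.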

computational linguistic, machine learning, question answering, (19 more...)

2305.15932

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > Dominican Republic (0.04)
(5 more...)

Genre: Research Report > New Finding (0.48)

Industry: Education (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.69)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)

Ke, Chuyang, Honorio, Jean

Partial Inference in Structured Prediction

In the past decades, various forms of structured prediction have been used extensively across many fields, including computer vision, natural language processing, network analysis, and computational chemistry, to name a few. In these fields, examples of structured prediction problems include foreground/background detection in a digital image [Nowozin et al., 2011], grammatical part-of-speech tagging in an English sentence [Weiss and Taskar, 2010], community identification and clustering in social networks [Kelley et al., 2012], and identifying representative subsets of millions of chemical compounds [Downs and Barnard, 2002]. On a higher level, all of the structured prediction inference problems mentioned above seek to maximize some score function over the space of labels. In other words, a common goal in inference tasks is to recover the label of each entity, such that the prediction matches the observation as much as possible. Suppose we represent the structured prediction inference problem using an undirected graph G = (V, E), where each node represents an entity, and each edge represents the interaction between two nodes.
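
To make "maximize some score function over the space of labels" concrete, the sketch below runs exhaustive inference on a tiny graph. The binary label set {-1, +1}, the particular score (node observations plus edge agreement terms), and all numbers are assumptions chosen for illustration; the excerpt does not specify them, and brute force is only feasible for very small graphs.

```python
from itertools import product


def best_labeling(nodes, node_obs, edge_obs):
    """Exhaustively search labelings y in {-1, +1}^V for the highest score.

    Assumed score: sum_v node_obs[v] * y[v] + sum_(u,v) edge_obs[(u, v)] * y[u] * y[v].
    """
    best, best_score = None, float("-inf")
    for labels in product((-1, 1), repeat=len(nodes)):
        y = dict(zip(nodes, labels))
        score = sum(node_obs[v] * y[v] for v in nodes)
        score += sum(w * y[u] * y[v] for (u, v), w in edge_obs.items())
        if score > best_score:
            best, best_score = y, score
    return best, best_score


nodes = ["a", "b", "c"]
node_obs = {"a": 1.0, "b": -0.2, "c": 0.0}          # noisy per-node evidence
edge_obs = {("a", "b"): 1.0, ("b", "c"): 1.0}       # positive weight favors equal labels

labels, score = best_labeling(nodes, node_obs, edge_obs)
print(labels, score)
```

Here the edge terms outweigh the weak negative evidence on node `b`, so the maximizing labeling assigns +1 to every node.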

artificial intelligence, inductive learning, machine learning, (18 more...)

2306.03949

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Shen, Li-Hsiang, Chen, Kai-Jui, Hsiao, An-Hung, Feng, Kai-Ten

BTS: Bifold Teacher-Student in Semi-Supervised Learning for Indoor Two-Room Presence Detection Under Time-Varying CSI

In recent years, indoor human presence detection based on supervised learning (SL) and channel state information (CSI) has attracted much attention. However, existing studies that rely on spatial information of CSI are susceptible to environmental changes which degrade prediction accuracy. Moreover, SL-based methods require time-consuming data labeling for retraining models. Therefore, it is imperative to design a continuously monitored model using a semi-supervised learning (SSL) based scheme. In this paper, we conceive a bifold teacher-student (BTS) learning approach for indoor human presence detection in an adjoining two-room scenario. The proposed SSL-based primal-dual teacher-student network intelligently learns spatial and temporal features from labeled and unlabeled CSI datasets. Additionally, the enhanced penalized loss function leverages entropy and distance measures to distinguish drifted data, i.e., features of new datasets affected by time-varying effects and altered from the original distribution. Experimental results demonstrate that the proposed BTS system sustains asymptotic accuracy after retraining the model with unlabeled data. Furthermore, BTS outperforms existing SSL-based models in detection accuracy while achieving the asymptotic performance of SL-based methods.

accuracy, artificial intelligence, machine learning, (19 more...)

2212.10802

Country:

North America > United States > California (0.04)
Asia > Taiwan (0.04)

Genre: Research Report (0.70)

Industry:

Education (0.67)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.82)

Wang, Jinhuan, Shao, Jiafei, Wang, Zeyu, Yu, Shanqing, Xuan, Qi, Yang, Xiaoniu

Subgraph Networks Based Contrastive Learning

Graph contrastive learning (GCL), as a self-supervised learning method, can solve the problem of annotated data scarcity. It mines explicit features in unannotated graphs to generate favorable graph representations for downstream tasks. Most existing GCL methods focus on the design of graph augmentation strategies and mutual information estimation operations. Graph augmentation produces augmented views by graph perturbations. These views preserve a locally similar structure and exploit explicit features. However, these methods have not considered the interaction existing in subgraphs. To explore the impact of substructure interactions on graph representations, we propose a novel framework called subgraph network-based contrastive learning (SGNCL). SGNCL applies a subgraph network generation strategy to produce augmented views. This strategy converts the original graph into an Edge-to-Node mapping network with both topological and attribute features. The single-shot augmented view is a first-order subgraph network that mines the interaction between nodes, node-edge, and edges. In addition, we also investigate the impact of the second-order subgraph augmentation on mining graph structure interactions, and further, propose a contrastive objective that fuses the first-order and second-order subgraph information. We compare SGNCL with classical and state-of-the-art graph contrastive learning methods on multiple benchmark datasets of different domains. Extensive experiments show that SGNCL achieves competitive or better performance (top three) on all datasets in unsupervised learning settings. Furthermore, SGNCL achieves the best average gain of 6.9% in transfer learning compared to the best method. Finally, experiments also demonstrate that mining substructure interactions has positive implications for graph contrastive learning.
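
One plausible reading of the "Edge-to-Node mapping" mentioned in the abstract above is a line-graph construction: every edge of the original graph becomes a node, and two such nodes are linked when the original edges share an endpoint. The sketch below follows that reading only; it is an assumption, it ignores attribute features, and it is not the authors' implementation.

```python
from itertools import combinations


def edge_to_node_network(edges):
    """Map an undirected graph to a network whose nodes are the original edges.

    Two new nodes are linked when the corresponding original edges share
    an endpoint (a line-graph construction, assumed here for illustration).
    """
    new_nodes = sorted({tuple(sorted(e)) for e in edges})
    new_links = {
        (e1, e2)
        for e1, e2 in combinations(new_nodes, 2)
        if set(e1) & set(e2)
    }
    return new_nodes, new_links


# A path graph a - b - c - d.
edges = [("a", "b"), ("c", "b"), ("c", "d")]
first_nodes, first_links = edge_to_node_network(edges)
print(first_nodes)
print(first_links)

# Applying the same mapping again gives a second-order view under this reading.
second_nodes, second_links = edge_to_node_network(first_links)
```

On the path graph, the three original edges become three nodes joined in a shorter path, and the second application shrinks it again to two nodes with one link.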

artificial intelligence, machine learning, representation, (17 more...)

2306.03506

Country:

Asia > China > Zhejiang Province > Hangzhou (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (0.47)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.34)

Anagnostopoulou, Aliki, Hartmann, Mareike, Sonntag, Daniel

Putting Humans in the Image Captioning Loop

Image Captioning (IC) models can benefit greatly from human feedback in the training process, especially in cases where data is limited. We present work-in-progress on adapting an IC system to integrate human feedback, with the goal of making it easily adaptable to user-specific data. Our approach builds on a base IC model pre-trained on the MS COCO dataset, which generates captions for unseen images. The user will then be able to offer feedback on the image and the generated/predicted caption, which will be augmented to create additional training instances for the adaptation of the model. The additional instances are integrated into the model using step-wise updates, and a sparse memory replay component is used to avoid catastrophic forgetting. We hope that this approach, while leading to improved results, will also result in customizable IC models.

artificial intelligence, caption, machine learning, (17 more...)

2306.03476

Country:

Europe > Germany (0.05)
North America > United States > New York > New York County > New York City (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Mannix, Evelyn J., Bondell, Howard D.

Cold PAWS: Unsupervised class discovery and addressing the cold-start problem for semi-supervised learning

In many machine learning applications, labeling datasets can be an arduous and time-consuming task. Although research has shown that semi-supervised learning techniques can achieve high accuracy with very few labels within the field of computer vision, little attention has been given to how images within a dataset should be selected for labeling. In this paper, we propose a novel approach based on well-established self-supervised learning, clustering, and manifold learning techniques that addresses this challenge of selecting an informative image subset to label in the first instance, which is known as the cold-start or unsupervised selective labelling problem. We test our approach using several publicly available datasets, namely CIFAR10, Imagenette, DeepWeeds, and EuroSAT, and observe improved performance with both supervised and semi-supervised learning strategies when our label selection strategy is used, in comparison to random sampling. We also obtain superior performance for the datasets considered with a much simpler approach compared to other methods in the literature.

artificial intelligence, inductive learning, machine learning, (18 more...)

2305.10071

Country:

North America > United States (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry:

Education (0.49)
Health & Medicine > Diagnostic Medicine (0.46)
Transportation (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)