AITopics

2303.10365

Country: Asia > China > Chongqing Province > Chongqing (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Kukleva, Anna, Böhle, Moritz, Schiele, Bernt, Kuehne, Hilde, Rupprecht, Christian

Temperature Schedules for Self-Supervised Contrastive Methods on Long-Tail Data

arXiv.org Artificial Intelligence Mar-23-2023

Most approaches for self-supervised learning (SSL) are optimised on curated balanced datasets, e.g. ImageNet, despite the fact that natural data usually exhibits long-tail distributions. In particular, we investigate the role of the temperature parameter τ in the contrastive loss, by analysing the loss through the lens of average distance maximisation, and find that a large τ emphasises group-wise discrimination, whereas a small τ leads to a higher degree of instance discrimination. While τ has thus far been treated exclusively as a constant hyperparameter, in this work, we propose to employ a dynamic τ and show that a simple cosine schedule can yield significant improvements in the learnt representations. Such a schedule results in a constant 'task switching' between an emphasis on instance discrimination and group-wise discrimination and thereby ensures that the model learns both group-wise features, as well as instance-specific details. Since frequent classes benefit from the former, while infrequent classes require the latter, we find this method to consistently improve separation between the classes in long-tail data without any additional computational cost. Deep Neural Networks have shown remarkable capabilities at learning representations of their inputs that are useful for a variety of tasks. Especially since the advent of recent self-supervised learning (SSL) techniques, rapid progress towards learning universally useful representations has been made. Currently, however, SSL on images is mainly carried out on benchmark datasets that have been constructed and curated for supervised learning (e.g. Although the labels of curated datasets are not explicitly used in SSL, the structure of the data still follows the predefined set of classes.
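
As a rough illustration of the dynamic τ described above, the following Python sketch oscillates the temperature with a cosine and feeds it into a contrastive loss for a single anchor. The function names, the bounds `tau_min` and `tau_max`, and the `period` parameter are assumptions made for illustration; the abstract only states that a simple cosine schedule is used, so this should not be read as the authors' exact formulation.

```python
import math


def cosine_temperature(step, period, tau_min=0.1, tau_max=1.0):
    """Oscillate tau between tau_max and tau_min (assumed bounds) with a cosine."""
    phase = 2.0 * math.pi * step / period
    return tau_min + 0.5 * (tau_max - tau_min) * (1.0 + math.cos(phase))


def info_nce(sim_pos, sim_negs, tau):
    """Contrastive loss for one anchor: negative log-softmax of the positive pair."""
    logits = [sim_pos / tau] + [s / tau for s in sim_negs]
    m = max(logits)  # subtract the maximum for numerical stability
    log_denominator = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_denominator - logits[0]


# The same similarities give different losses as tau moves through the schedule.
for step in (0, 25, 50):
    tau = cosine_temperature(step, period=100)
    print(step, round(tau, 3), round(info_nce(0.9, [0.1, 0.2], tau), 4))
```

Calling `cosine_temperature` once per training step and passing the result to the loss is enough to alternate between the large-τ and small-τ regimes the abstract describes.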

artificial intelligence, inductive learning, machine learning, (21 more...)

2303.13664

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Germany > Saarland (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Liu, Fangyu, Emerson, Guy, Collier, Nigel

Visual Spatial Reasoning

arXiv.org Artificial Intelligence Mar-22-2023

Spatial relations are a basic part of human cognition. However, they are expressed in natural language in a variety of ways, and previous work has suggested that current vision-and-language models (VLMs) struggle to capture relational information. In this paper, we present Visual Spatial Reasoning (VSR), a dataset containing more than 10k natural text-image pairs with 66 types of spatial relations in English (such as: under, in front of, and facing). While using a seemingly simple annotation format, we show how the dataset includes challenging linguistic phenomena, such as varying reference frames. We demonstrate a large gap between human and model performance: the human ceiling is above 95%, while state-of-the-art models only achieve around 70%. We observe that VLMs' by-relation performances have little correlation with the number of training examples and the tested models are in general incapable of recognising relations concerning the orientations of objects.

machine learning, natural language, relation, (19 more...)

2205.00363

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(14 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.55)

arXiv.org Artificial Intelligence Mar-22-2023

PromptDA: Label-guided Data Augmentation for Prompt-based Few-shot Learners

Chen, Canyu, Shu, Kai

Recent advances in large pre-trained language models (PLMs) lead to impressive gains in natural language understanding (NLU) tasks with task-specific fine-tuning. However, directly fine-tuning PLMs relies heavily on sufficient labeled training instances, which are usually hard to obtain. Prompt-based tuning on PLMs has been shown to be powerful for various downstream few-shot tasks. Existing works studying prompt-based tuning for few-shot NLU tasks mainly focus on deriving proper label words with a verbalizer or generating prompt templates to elicit semantics from PLMs. In addition, conventional data augmentation strategies such as synonym substitution, though widely adopted in low-resource scenarios, only bring marginal improvements for prompt-based few-shot learning. Thus, an important research question arises: how to design effective data augmentation methods for prompt-based few-shot tuning? To this end, considering that label semantics are essential in prompt-based tuning, we propose a novel label-guided data augmentation framework, PromptDA, which exploits the enriched label semantic information for data augmentation. Extensive experimental results on few-shot text classification tasks demonstrate the superior performance of the proposed framework by effectively leveraging label semantics and data augmentation for natural language understanding. Our code is available at https://github.com/canyuchen/PromptDA.

machine learning, natural language, text classification, (20 more...)

2205.09229

Country:

North America > United States > Washington > King County > Seattle (0.14)
Europe > Austria > Vienna (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.05)
(9 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.49)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.48)

Cole, Jeremy R., Chaudhary, Aditi, Dhingra, Bhuwan, Talukdar, Partha

Salient Span Masking for Temporal Understanding

arXiv.org Artificial Intelligence Mar-22-2023

Salient Span Masking (SSM) has shown itself to be an effective strategy to improve closed-book question answering performance. SSM extends general masked language model pretraining by creating additional unsupervised training sentences that mask a single entity or date span, thus oversampling factual information. Despite the success of this paradigm, the span types and sampling strategies are relatively arbitrary and not widely studied for other tasks. Thus, we investigate SSM from the perspective of temporal tasks, where learning a good representation of various temporal expressions is important. To that end, we introduce Temporal Span Masking (TSM) intermediate training. First, we find that SSM alone improves the downstream performance on three temporal tasks by an avg. +5.8 points. Further, we are able to achieve additional improvements (avg. +0.29 points) by adding the TSM task. These comprise the new best reported results on the targeted tasks. Our analysis suggests that the effectiveness of SSM stems from the sentences chosen in the training data rather than the mask choice: sentences with entities frequently also contain temporal expressions. Nonetheless, the additional targeted spans of TSM can still improve performance, especially in a zero-shot context.
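
The idea of creating extra training sentences that each mask a single span can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the regular expression that matches four-digit years is a hypothetical stand-in for a real entity or date tagger, and `MASK` and `mask_single_spans` are names invented for this example.

```python
import re

MASK = "[MASK]"
# Hypothetical stand-in for a date tagger: matches four-digit years only.
YEAR = re.compile(r"\b(?:1[0-9]{3}|20[0-9]{2})\b")


def mask_single_spans(sentence, pattern=YEAR):
    """Return one (masked_sentence, target_span) pair per matched span."""
    examples = []
    for match in pattern.finditer(sentence):
        masked = sentence[:match.start()] + MASK + sentence[match.end():]
        examples.append((masked, match.group(0)))
    return examples


sentence = "The bridge opened in 1937 and was retrofitted in 1997."
for masked, target in mask_single_spans(sentence):
    print(masked, "->", target)
```

Each sentence yields as many training examples as it has matched spans, which is one way to read the abstract's remark that the procedure oversamples factual information.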

large language model, machine learning, natural language, (17 more...)

2303.12860

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Dominican Republic (0.05)
Asia > China > Hong Kong (0.04)
(3 more...)

Genre: Research Report (0.40)

Industry: Health & Medicine > Therapeutic Area (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.66)

Safe Self-Supervised Learning in Real of Visuo-Tactile Feedback Policies for Industrial Insertion

Fu, Letian, Huang, Huang, Berscheid, Lars, Li, Hui, Goldberg, Ken, Chitta, Sachin

Industrial insertion tasks are often performed repetitively with parts that are subject to tight tolerances and prone to breakage. Learning an industrial insertion policy in real is challenging as the collision between the parts and the environment can cause slippage or breakage of the part. In this paper, we present a safe self-supervised method to learn a visuo-tactile insertion policy that is robust to grasp pose variations. The method reduces human input and collisions between the part and the receptacle. The method divides the insertion task into two phases. In the first align phase, a tactile-based grasp pose estimation model is learned to align the insertion part with the receptacle. In the second insert phase, a vision-based policy is learned to guide the part into the receptacle. The robot uses force-torque sensing to achieve a safe self-supervised data collection pipeline. Physical experiments on the USB insertion task from the NIST Assembly Taskboard suggest that the resulting policies can achieve 45/45 insertion successes on 45 different initial grasp poses, improving on two baselines: (1) a behavior cloning agent trained on 50 human insertion demonstrations (1/45) and (2) an online RL policy (TD3) trained in real (0/45).

artificial intelligence, machine learning, receptacle, (18 more...)

2210.01340

Country:

North America > United States > New York (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.40)

Vishniakov, Kirill, Xing, Eric, Shen, Zhiqiang

MixMask: Revisiting Masking Strategy for Siamese ConvNets

Recent advances in self-supervised learning have integrated Masked Image Modeling (MIM) and Siamese Networks into a unified framework that leverages the benefits of both techniques. However, several issues remain unaddressed when applying conventional erase-based masking with Siamese ConvNets. These include (I) the inability to drop uninformative masked regions in ConvNets as they process data continuously, resulting in low training efficiency compared to ViT models; and (II) the mismatch between erase-based masking and the contrastive-based objective in Siamese ConvNets, which differs from the MIM approach. In this paper, we propose a filling-based masking strategy called MixMask to prevent information incompleteness caused by the randomly erased regions in an image in the vanilla masking method. Furthermore, we introduce a flexible loss function design that considers the semantic distance change between two different mixed views to adapt the integrated architecture and prevent mismatches between the transformed input and objective in Masked Siamese ConvNets (MSCN). We conducted extensive experiments on various datasets, including CIFAR-100, Tiny-ImageNet, and ImageNet-1K. The results demonstrate that our proposed framework achieves superior accuracy on linear probing, semi-supervised, and supervised finetuning, outperforming the state-of-the-art MSCN by a significant margin. Additionally, we demonstrate the superiority of our approach in object detection and segmentation tasks. Our source code is available at https://github.com/LightnessOfBeing/MixMask.
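
To make the contrast between erase-based and filling-based masking concrete, here is a small Python sketch on toy images stored as nested lists. It is an illustration under stated assumptions rather than the MixMask implementation: the block-wise mask generator, the function names, and the 50% ratio are all hypothetical, and the loss design mentioned in the abstract is not covered.

```python
import random


def block_mask(height, width, block, ratio, rng):
    """Binary mask built from square blocks; 1 marks a pixel to be replaced."""
    rows, cols = height // block, width // block
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    chosen = set(rng.sample(cells, round(ratio * len(cells))))
    return [[1 if (y // block, x // block) in chosen else 0
             for x in range(width)] for y in range(height)]


def erase(img, mask, fill_value=0):
    """Erase-based masking: masked pixels are overwritten with a constant."""
    return [[fill_value if m else p for p, m in zip(row, mrow)]
            for row, mrow in zip(img, mask)]


def fill_mix(img_a, img_b, mask):
    """Filling-based masking: masked pixels of img_a are filled from img_b."""
    return [[b if m else a for a, b, m in zip(ra, rb, rm)]
            for ra, rb, rm in zip(img_a, img_b, mask)]


rng = random.Random(0)
mask = block_mask(4, 4, block=2, ratio=0.5, rng=rng)
img_a = [[1] * 4 for _ in range(4)]
img_b = [[2] * 4 for _ in range(4)]
erased = erase(img_a, mask)
mixed = fill_mix(img_a, img_b, mask)
```

In the erased view half of the pixels carry no information, whereas in the mixed view every pixel comes from one of the two source images.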

artificial intelligence, machine learning, representation, (17 more...)

2210.11456

Country: Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.49)

Banerjee, Debopriyo, Jain, Mausam, Kulkarni, Ashish

MFBE: Leveraging Multi-Field Information of FAQs for Efficient Dense Retrieval

In the domain of question-answering in NLP, the retrieval of Frequently Asked Questions (FAQ) is an important sub-area which is well researched and has been worked upon for many languages. Here, in response to a user query, a retrieval system typically returns the relevant FAQs from a knowledge-base. The efficacy of such a system depends on its ability to establish a semantic match between the query and the FAQs in real time. The task becomes challenging due to the inherent lexical gap between queries and FAQs, lack of sufficient context in FAQ titles, scarcity of labeled data, and high retrieval latency. In this work, we propose a bi-encoder-based query-FAQ matching model that leverages multiple combinations of FAQ fields (such as question, answer, and category) during both model training and inference. Our proposed Multi-Field Bi-Encoder (MFBE) model benefits from the additional context resulting from multiple FAQ fields and performs well even with minimal labeled data. We empirically support this claim through experiments on proprietary as well as open-source public datasets in both unsupervised and supervised settings. Our model achieves around 27% and 23% better top-1 accuracy for the FAQ retrieval task on internal and open datasets, respectively, over the best-performing baseline.
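
The benefit of matching a query against several combinations of FAQ fields can be sketched with a toy example. Everything here is an assumption made for illustration: the bag-of-words `encode` function stands in for a learned encoder, the list of field combinations is hypothetical, and taking the maximum similarity over combinations is one possible aggregation, not necessarily the one MFBE uses.

```python
import math
import re
from collections import Counter

# Hypothetical field combinations; the abstract does not list the ones used.
FIELD_COMBINATIONS = [
    ("question",),
    ("question", "answer"),
    ("question", "category"),
    ("question", "answer", "category"),
]


def encode(text):
    """Toy stand-in for a learned encoder: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0


def score(query, faq):
    """Assumed aggregation: best similarity over all field combinations."""
    q = encode(query)
    views = [encode(" ".join(faq[f] for f in combo)) for combo in FIELD_COMBINATIONS]
    return max(cosine(q, view) for view in views)


def rank(query, faqs):
    """Indices of FAQs sorted from most to least similar to the query."""
    return sorted(range(len(faqs)), key=lambda i: score(query, faqs[i]), reverse=True)


faqs = [
    {"question": "Account access",
     "answer": "You can reset your password from the settings page.",
     "category": "login"},
    {"question": "Shipping times",
     "answer": "Orders arrive within five days.",
     "category": "delivery"},
]
ranking = rank("reset password", faqs)
```

The first FAQ's title alone shares no words with the query, so it is only retrieved because the answer field is part of one of the encoded views. In a real bi-encoder the FAQ views would be encoded once offline, leaving only the query to be encoded at request time.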

artificial intelligence, machine learning, natural language, (18 more...)

2302.11953

Country: Asia > India (0.04)

Genre: Frequently Asked Questions (FAQ) (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.46)

Delilbasic, Amer, Saux, Bertrand Le, Riedel, Morris, Michielsen, Kristel, Cavallaro, Gabriele

A Single-Step Multiclass SVM based on Quantum Annealing for Remote Sensing Data Classification

In recent years, the development of quantum annealers has enabled experimental demonstrations and has increased research interest in applications of quantum annealing, such as in quantum machine learning and in particular for the popular quantum SVM. Several versions of the quantum SVM have been proposed, and quantum annealing has been shown to be effective in them. Extensions to multiclass problems have also been made, which consist of an ensemble of multiple binary classifiers. This work proposes a novel quantum SVM formulation for direct multiclass classification based on quantum annealing, called Quantum Multiclass SVM (QMSVM). The multiclass classification problem is formulated as a single Quadratic Unconstrained Binary Optimization (QUBO) problem solved with quantum annealing. The main objective of this work is to evaluate the feasibility, accuracy, and time performance of this approach. Experiments have been performed on the D-Wave Advantage quantum annealer for a classification problem on remote sensing data. The results indicate that, despite the memory demands of the quantum annealer, QMSVM can achieve accuracy that is comparable to standard SVM methods and, more importantly, it scales much more efficiently with the number of training examples, resulting in nearly constant time. This work shows an approach for bringing together classical and quantum computation, solving practical problems in remote sensing with current hardware.
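
For readers unfamiliar with the QUBO form mentioned above, the sketch below shows what such a problem looks like and solves a tiny instance by exhaustive search. It is only a classical stand-in for an annealer and does not reproduce the QMSVM formulation: the matrix `Q` is an arbitrary toy example and the function names are invented for illustration.

```python
from itertools import product


def qubo_energy(Q, x):
    """Objective of a QUBO: sum over i, j of Q[i][j] * x[i] * x[j], with x binary."""
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


def solve_qubo_brute_force(Q):
    """Try every binary assignment; feasible only for very small problems."""
    n = len(Q)
    return min(product((0, 1), repeat=n), key=lambda x: qubo_energy(Q, x))


# Toy instance: diagonal entries reward setting a bit, the off-diagonal
# entry penalises setting both bits at once.
Q = [[-1, 2],
     [0, -2]]
best = solve_qubo_brute_force(Q)
print(best, qubo_energy(Q, best))
```

Exhaustive search grows as 2^n in the number of binary variables, which is why heuristic solvers such as annealers are of interest for larger instances.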

artificial intelligence, inductive learning, machine learning, (16 more...)

2303.11705

Country:

Europe > Germany > Brandenburg > Potsdam (0.06)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.05)
Europe > Iceland > Capital Region > Reykjavik (0.04)
(9 more...)

Genre: Research Report (0.82)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.83)
Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.49)

Addressing Class Variable Imbalance in Federated Semi-supervised Learning

Dong, Zehui, Liu, Wenjing, Liu, Siyuan, Chen, Xingzhi

Federated Semi-supervised Learning (FSSL) combines techniques from federated learning and semi-supervised learning to improve the accuracy and performance of models in a distributed environment by using a small fraction of labeled data and a large amount of unlabeled data. Without needing to centralize all data in one place for training, it collects model updates after devices train models locally, and can thus protect the privacy of user data. However, during the federated training process, some devices fail to collect enough data for local training, while new devices join the group training. This leads to an unbalanced global data distribution and thus affects the performance of global model training. Most current research focuses on class imbalance with a fixed number of classes, while little attention is paid to data imbalance with a variable number of classes. Therefore, in this paper, we propose Federated Semi-supervised Learning for Class Variable Imbalance (FCVI) to solve class variable imbalance. The class-variable learning algorithm is used to mitigate the data imbalance caused by changes in the number of classes. Our scheme is proven to be significantly better than baseline methods, while maintaining client privacy.

computer science & information technology, imbalance, unlabeled data, (12 more...)

doi: 10.5121/csit.2023.130522

2303.11809

Country:

Asia > Mongolia (0.05)
Asia > China > Inner Mongolia > Hohhot (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (0.88)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)