AITopics

2009.05175

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > South Korea > Seoul > Seoul (0.04)
Asia > China > Hong Kong (0.04)
(14 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.54)
(3 more...)

#artificialintelligenceOct-29-2022, 15:50:23 GMT

GitHub - facebookresearch/vissl: VISSL is FAIR's library of extensible, modular and scalable components for SOTA Self-Supervised Learning with images.

Below we share, in reverse chronological order, the updates and new releases in VISSL. All VISSL releases are available here. VISSL is a computer VIsion library for state-of-the-art Self-Supervised Learning research with PyTorch. VISSL aims to accelerate research cycle in self-supervised learning: from designing a new self-supervised task to evaluating the learned representations. Benchmark suite: Variety of benchmarks tasks including linear image classification (places205, imagenet1k, voc07, food, CLEVR, dsprites, UCF101, stanford cars and many more), full finetuning, semi-supervised benchmark, nearest neighbor benchmark, object detection (Pascal VOC and COCO).

library, sota self-supervised learning, vissl, (10 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Vision (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.85)

Spectrograms Are Sequences of Patches

Zhao, Leyi, Li, Yi

Self-supervised pre-training models have been used successfully in several machine learning domains. However, only a tiny amount of work is related to music. In our work, we treat a spectrogram of music as a series of patches and design a self-supervised model that captures the features of these sequential patches: Patchifier, which makes good use of self-supervised learning methods from both NLP and CV domains. We do not use labeled data for the pre-training process, only a subset of the MTAT dataset containing 16k music clips. After pre-training, we apply the model to several downstream tasks. Our model achieves a considerably acceptable result compared to other audio representation models. Meanwhile, our work demonstrates that it makes sense to consider audio as a series of patch segments.

artificial intelligence, inductive learning, machine learning, (16 more...)

2210.15988

Country:

North America > United States > Virginia (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.56)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Review on Classification Techniques used in Biophysiological Stress Monitoring

Iqbal, Talha, Elahi, Adnan, Shahzad, Atif, Wijns, William

Cardiovascular activities are directly related to the response of a body in a stressed condition. Stress, based on its intensity, can be divided into two types i.e. Acute stress (short-term stress) and Chronic stress (long-term stress). Repeated acute stress and continuous chronic stress may play a vital role in inflammation in the circulatory system and thus leads to a heart attack or to a stroke. In this study, we have reviewed commonly used machine learning classification techniques applied to different stress-indicating parameters used in stress monitoring devices. These parameters include Photoplethysmograph (PPG), Electrocardiographs (ECG), Electromyograph (EMG), Galvanic Skin Response (GSR), Heart Rate Variation (HRV), skin temperature, respiratory rate, Electroencephalograph (EEG) and salivary cortisol, used in stress monitoring devices. This study also provides a discussion on choosing a classifier, which depends upon a number of factors other than accuracy, like the number of subjects involved in an experiment, type of signals processing and computational limitations.

artificial intelligence, classifier, machine learning, (22 more...)

2210.1604

Country:

North America > United States (0.04)
Europe > United Kingdom > England > West Midlands > Birmingham (0.04)
Europe > Netherlands > Limburg > Maastricht (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.88)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(4 more...)

Ovsianas, Andrius, Ramapuram, Jason, Busbridge, Dan, Dhekane, Eeshan Gunesh, Webb, Russ

Elastic Weight Consolidation Improves the Robustness of Self-Supervised Learning Methods under Transfer

Self-supervised representation learning (SSL) methods provide an effective label-free initial condition for fine-tuning downstream tasks. However, in numerous realistic scenarios, the downstream task might be biased with respect to the target label distribution. This in turn moves the learned fine-tuned model posterior away from the initial (label) bias-free self-supervised model posterior. In this work, we re-interpret SSL fine-tuning under the lens of Bayesian continual learning and consider regularization through the Elastic Weight Consolidation (EWC) framework. We demonstrate that self-regularization against an initial SSL backbone improves worst sub-group performance in Waterbirds by 5% and Celeb-A by 2% when using the ViT-B/16 architecture. Furthermore, to help simplify the use of EWC with SSL, we pre-compute and publicly release the Fisher Information Matrix (FIM), evaluated with 10,000 ImageNet-1K variates evaluated on large modern SSL architectures including ViT-B/16 and ResNet50 trained with DINO.

artificial intelligence, machine learning, vit-b 16, (17 more...)

2210.16365

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Nevada > Clark County > Las Vegas (0.04)
North America > United States > Florida > Miami-Dade County > Miami (0.04)
(4 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)

Chaudhry, Arslan, Menon, Aditya Krishna, Veit, Andreas, Jayasumana, Sadeep, Ramalingam, Srikumar, Kumar, Sanjiv

When does mixup promote local linearity in learned representations?

Mixup is a regularization technique that artificially produces new samples using convex combinations of original training points. This simple technique has shown strong empirical performance, and has been heavily used as part of semi-supervised learning techniques such as mixmatch~\citep{berthelot2019mixmatch} and interpolation consistent training (ICT)~\citep{verma2019interpolation}. In this paper, we look at Mixup through a \emph{representation learning} lens in a semi-supervised learning setup. In particular, we study the role of Mixup in promoting linearity in the learned network representations. Towards this, we study two questions: (1) how does the Mixup loss that enforces linearity in the \emph{last} network layer propagate the linearity to the \emph{earlier} layers?; and (2) how does the enforcement of stronger Mixup loss on more than two data points affect the convergence of training? We empirically investigate these properties of Mixup on vision datasets such as CIFAR-10, CIFAR-100 and SVHN. Our results show that supervised Mixup training does not make \emph{all} the network layers linear; in fact the \emph{intermediate layers} become more non-linear during Mixup training compared to a network that is trained \emph{without} Mixup. However, when Mixup is used as an unsupervised loss, we observe that all the network layers become more linear resulting in faster training convergence.

artificial intelligence, inductive learning, machine learning, (16 more...)

2210.16413

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Unsupervised or Indirectly Supervised Learning (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Mohamadi, Salman, Doretto, Gianfranco, Adjeroh, Donald A.

FUSSL: Fuzzy Uncertain Self Supervised Learning

arXiv.org Artificial IntelligenceOct-27-2022

Self supervised learning (SSL) has become a very successful technique to harness the power of unlabeled data, with no annotation effort. A number of developed approaches are evolving with the goal of outperforming supervised alternatives, which have been relatively successful. One main issue in SSL is robustness of the approaches under different settings. In this paper, for the first time, we recognize the fundamental limits of SSL coming from the use of a single-supervisory signal. To address this limitation, we leverage the power of uncertainty representation to devise a robust and general standard hierarchical learning/training protocol for any SSL baseline, regardless of their assumptions and approaches. Essentially, using the information bottleneck principle, we decompose feature learning into a two-stage training procedure, each with a distinct supervision signal. This double supervision approach is captured in two key steps: 1) invariance enforcement to data augmentation, and 2) fuzzy pseudo labeling (both hard and soft annotation). This simple, yet, effective protocol which enables cross-class/cluster feature learning, is instantiated via an initial training of an ensemble of models through invariance enforcement to data augmentation as first training phase, and then assigning fuzzy labels to the original samples for the second training phase. We consider multiple alternative scenarios with double supervision and evaluate the effectiveness of our approach on recent baselines, covering four different SSL paradigms, including geometrical, contrastive, non-contrastive, and hard/soft whitening (redundancy reduction) baselines. Extensive experiments under multiple settings show that the proposed training protocol consistently improves the performance of the former baselines, independent of their respective underlying principles.

artificial intelligence, baseline, machine learning, (17 more...)

2210.15818

Country:

North America > United States > West Virginia > Monongalia County > Morgantown (0.04)
North America > United States > Colorado (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Hsu, Hans Hao-Hsun, Shen, Yuesong, Cremers, Daniel

A Graph Is More Than Its Nodes: Towards Structured Uncertainty-Aware Learning on Graphs

arXiv.org Artificial IntelligenceOct-27-2022

Current graph neural networks (GNNs) that tackle node classification on graphs tend to only focus on nodewise scores and are solely evaluated by nodewise metrics. This limits uncertainty estimation on graphs since nodewise marginals do not fully characterize the joint distribution given the graph structure. In this work, we propose novel edgewise metrics, namely the edgewise expected calibration error (ECE) and the agree/disagree ECEs, which provide criteria for uncertainty estimation on graphs beyond the nodewise setting. Our experiments demonstrate that the proposed edgewise metrics can complement the nodewise results and yield additional insights. Moreover, we show that GNN models which consider the structured prediction problem on graphs tend to have better uncertainty estimations, which illustrates the benefit of going beyond the nodewise setting.

artificial intelligence, machine learning, uncertainty estimation, (15 more...)

2210.15575

Country:

Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.49)

Yeh, Chih-Kuan, Taly, Ankur, Sundararajan, Mukund, Liu, Frederick, Ravikumar, Pradeep

First is Better Than Last for Language Data Influence

arXiv.org Artificial IntelligenceOct-27-2022

The ability to identify influential training examples enables us to debug training data and explain model behavior. Existing techniques to do so are based on the flow of training data influence through the model parameters. For large models in NLP applications, it is often computationally infeasible to study this flow through all model parameters, therefore techniques usually pick the last layer of weights. However, we observe that since the activation connected to the last layer of weights contains "shared logic", the data influenced calculated via the last layer weights prone to a ``cancellation effect'', where the data influence of different examples have large magnitude that contradicts each other. The cancellation effect lowers the discriminative power of the influence score, and deleting influential examples according to this measure often does not change the model's behavior by much. To mitigate this, we propose a technique called TracIn-WE that modifies a method called TracIn to operate on the word embedding layer instead of the last layer, where the cancellation effect is less severe. One potential concern is that influence based on the word embedding layer may not encode sufficient high level information. However, we find that gradients (unlike embeddings) do not suffer from this, possibly because they chain through higher layers. We show that TracIn-WE significantly outperforms other data influence methods applied on the last layer significantly on the case deletion evaluation on three language classification tasks for different models. In addition, TracIn-WE can produce scores not just at the level of the overall training input, but also at the level of words within the training input, a further aid in debugging.

machine learning, natural language, tracin-we, (22 more...)

2202.11844

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Texas (0.04)
North America > United States > Montana (0.04)
(16 more...)

Genre: Research Report > New Finding (0.45)

Industry:

Banking & Finance > Trading (0.93)
Leisure & Entertainment > Sports > Baseball (0.93)
Law (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.89)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)

Li, Chak-Fai, Keith, Francis, Hartmann, William, Snover, Matthew

Training Autoregressive Speech Recognition Models with Limited in-domain Supervision

arXiv.org Artificial IntelligenceOct-26-2022

Advances in self-supervised learning have significantly reduced the amount of transcribed audio required for training. However, the majority of work in this area is focused on read speech. We explore limited supervision in the domain of conversational speech. While we assume the amount of in-domain data is limited, we augment the model with open source read speech data. The XLS-R model has been shown to perform well with limited adaptation data and serves as a strong baseline. We use untranscribed data for self-supervised learning and semi-supervised training in an autoregressive encoder-decoder model. We demonstrate that by using the XLS-R model for pseudotranscription, a much smaller autoregressive model can outperform a finetuned XLS-R model when transcribed in-domain data is limited, reducing WER by as much as 8% absolute.

artificial intelligence, inductive learning, machine learning, (15 more...)

2210.15135

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Industry: Government (0.69)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.56)