AITopics

2110.00467

Country: North America > United States > Pennsylvania (0.04)

Genre: Research Report > Experimental Study (0.87)

Industry:

Health & Medicine > Therapeutic Area > Hematology (0.49)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Learning in High Dimensional Spaces (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

#artificialintelligence Sep-26-2021, 07:05:11 GMT

How to Use Arabic Word2Vec Word Embedding with LSTM

Word embedding is the approach of learning words and their relative meanings from a corpus of text and representing each word as a dense vector. The word vector is the projection of the word into a continuous feature vector space; see Figure 1 (A). Words that have similar meanings should be close together in the vector space, as illustrated in Figure 1 (B). Word2vec is one of the most popular word embeddings in NLP. Word2vec has two types, the Continuous Bag-of-Words Model (CBOW) and the Continuous Skip-gram Model [3]; the model architectures are shown in Figure 2. CBOW predicts a word from its given context, whereas Skip-gram predicts the context from a given word, which increases the computational complexity [3].
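The contrast between CBOW and Skip-gram can be made concrete by looking at the training examples each one would draw from a sentence. The following is a minimal sketch, not the article's implementation: the function names, the symmetric window, and the toy tokens are assumptions made for illustration only.

```python
from typing import List, Tuple


def cbow_pairs(tokens: List[str], window: int = 2) -> List[Tuple[List[str], str]]:
    """Build (context words, target word) examples: CBOW predicts the target from its context."""
    pairs = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:  # skip positions with no neighbours at all
            pairs.append((context, target))
    return pairs


def skipgram_pairs(tokens: List[str], window: int = 2) -> List[Tuple[str, str]]:
    """Build (center word, context word) examples: Skip-gram predicts each context word from the center."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


if __name__ == "__main__":
    sentence = ["the", "cat", "sat", "on", "the", "mat"]  # hypothetical toy sentence
    print(len(cbow_pairs(sentence)), "CBOW examples")
    print(len(skipgram_pairs(sentence)), "Skip-gram examples")
```

In this sketch CBOW yields one example per position, while Skip-gram yields one example per (center, neighbour) pair, so it produces more predictions for the same text.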

sequence, use arabic word2vec word embedding, word2vec word embedding, (8 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.81)

Jones, Alex, Wang, William Yang, Mahowald, Kyle

A Massively Multilingual Analysis of Cross-linguality in Shared Embedding Space

arXiv.org Artificial Intelligence Sep-13-2021

In cross-lingual language models, representations for many different languages live in the same space. Here, we investigate the linguistic and non-linguistic factors affecting sentence-level alignment in cross-lingual pretrained language models for 101 languages and 5,050 language pairs. Using BERT-based LaBSE and BiLSTM-based LASER as our models, and the Bible as our corpus, we compute a task-based measure of cross-lingual alignment in the form of bitext retrieval performance, as well as four intrinsic measures of vector space alignment and isomorphism. We then examine a range of linguistic, quasi-linguistic, and training-related features as potential predictors of these alignment metrics. The results of our analyses show that word order agreement and agreement in morphological complexity are two of the strongest linguistic predictors of cross-linguality. We also note in-family training data as a stronger predictor than language-specific training data across the board. We verify some of our linguistic findings by looking at the effect of morphological segmentation on English-Inuktitut alignment, in addition to examining the effect of word order agreement on isomorphism for 66 zero-shot language pairs from a different corpus. We make the data and code for our experiments publicly available.
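Bitext retrieval, the task-based alignment measure mentioned above, can be sketched as a nearest-neighbour search between sentence embeddings of two languages. The snippet below is an illustrative sketch only: it assumes plain cosine similarity and that the i-th source sentence is the translation of the i-th target sentence. The paper's exact scoring function is not stated in this abstract, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieval_accuracy(src: List[Sequence[float]], tgt: List[Sequence[float]]) -> float:
    """Fraction of source sentences whose most similar target is the aligned one (same index)."""
    hits = 0
    for i, s in enumerate(src):
        best = max(range(len(tgt)), key=lambda j: cosine(s, tgt[j]))
        hits += int(best == i)
    return hits / len(src)


if __name__ == "__main__":
    # Hypothetical 2-D "embeddings" of two sentence pairs.
    source = [[1.0, 0.0], [0.0, 1.0]]
    target = [[0.9, 0.1], [0.2, 0.8]]
    print(retrieval_accuracy(source, target))
```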

computational linguistic, isomorphism, laser, (15 more...)

2109.06324

Country:

Europe > Italy > Tuscany > Florence (0.05)
North America > United States > California > Santa Barbara County > Santa Barbara (0.04)
North America > Canada > Nunavut (0.04)
(14 more...)

Genre: Research Report > New Finding (1.00)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Abnar, Samira, Berg, Rianne van den, Ghiasi, Golnaz, Dehghani, Mostafa, Kalchbrenner, Nal, Sedghi, Hanie

Gradual Domain Adaptation in the Wild: When Intermediate Distributions are Absent

arXiv.org Artificial Intelligence Jun-10-2021

We focus on the problem of domain adaptation when the goal is shifting the model towards the target distribution, rather than learning domain-invariant representations. It has been shown that under the following two assumptions: (a) access to samples from intermediate distributions, and (b) samples being annotated with the amount of change from the source distribution, self-training can be successfully applied on gradually shifted samples to adapt the model toward the target distribution. We hypothesize that having (a) is enough to enable iterative self-training to slowly adapt the model to the target distribution, by making use of an implicit curriculum. In the case where (a) does not hold, we observe that iterative self-training falls short. We propose GIFT, a method that creates virtual samples from intermediate distributions by interpolating representations of examples from source and target domains. We evaluate an iterative self-training method on datasets with natural distribution shifts, and show that when applied on top of other domain adaptation methods, it improves the performance of the model on the target dataset. We run an analysis on a synthetic dataset to show that in the presence of (a), iterative self-training naturally forms a curriculum of samples. Furthermore, we show that when (a) does not hold, GIFT performs better than iterative self-training.
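The idea of creating virtual intermediate samples by interpolating representations can be sketched as follows. This is a hedged illustration, not the GIFT algorithm itself: it assumes simple linear interpolation between one source and one target representation, and the function names, the pairing of examples, and the evenly spaced schedule are all hypothetical choices.

```python
from typing import List, Sequence


def interpolate(source_repr: Sequence[float], target_repr: Sequence[float], lam: float) -> List[float]:
    """Linear interpolation: lam = 0 returns the source representation, lam = 1 the target."""
    if len(source_repr) != len(target_repr):
        raise ValueError("representations must have the same dimension")
    return [(1.0 - lam) * s + lam * t for s, t in zip(source_repr, target_repr)]


def virtual_samples(source_repr: Sequence[float], target_repr: Sequence[float], steps: int) -> List[List[float]]:
    """Virtual samples that move from the source toward the target in `steps` equal increments."""
    return [interpolate(source_repr, target_repr, k / steps) for k in range(1, steps + 1)]


if __name__ == "__main__":
    # Hypothetical 2-D representations of one source and one target example.
    for sample in virtual_samples([0.0, 2.0], [4.0, 2.0], steps=4):
        print(sample)
```

Under these assumptions, a model could be self-trained on the samples for small interpolation weights first and larger ones later, which mimics having access to intermediate distributions.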

domain adaptation, representation, target domain, (17 more...)

2106.0608

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Yang, Yinchong, Buettner, Florian

Multi-output Gaussian Processes for Uncertainty-aware Recommender Systems

arXiv.org Machine Learning Jun-8-2021

Recommender systems are often designed based on a collaborative filtering approach, where user preferences are predicted by modelling interactions between users and items. Many common approaches to solve the collaborative filtering task are based on learning representations of users and items, including simple matrix factorization, Gaussian process latent variable models, and neural-network based embeddings. While matrix factorization approaches fail to model nonlinear relations, neural networks can potentially capture such complex relations with unprecedented predictive power and are highly scalable. However, neither of them is able to model predictive uncertainties. In contrast, Gaussian Process based models can generate a predictive distribution, but cannot scale to large amounts of data.

A database describing such user-item interactions often takes the form of a matrix, where each entry describes the interaction between one user and one item. The overall rating or purchasing pattern of a user can therefore be described by the corresponding row in such a matrix. However, since there are typically large numbers of users and items in the database, and each user is usually only interested in a small subset of items, this user-item matrix is often large and sparse. It is therefore inefficient to define the similarity between users in the high dimensional feature space defined by all items. Instead, it is more advantageous to derive abstract feature vectors that represent users and items, which inspired a large variety of low-rank matrix decomposition models such as non-negative matrix decomposition [Zhang et al., 2006], biased matrix decomposition [Koren et al., 2009] and non-parametric decomposition [Yu et al., 2009]. These methods aim at learning low dimensional representations for all users and items, allowing for the prediction of the unobserved interaction between a new pair of user and ...
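For readers unfamiliar with the low-rank baseline referred to here, the snippet below sketches plain matrix factorization: a user-item interaction is predicted as the dot product of two low-dimensional vectors, and the vectors are fitted by stochastic gradient descent on the squared error. This is a generic illustration of the baseline, not the Gaussian process model proposed in the paper; the function names, learning rate, and toy values are assumptions.

```python
from typing import List, Tuple


def predict(user_vec: List[float], item_vec: List[float]) -> float:
    """Predicted interaction: dot product of the user and item representations."""
    return sum(u * i for u, i in zip(user_vec, item_vec))


def sgd_step(user_vec: List[float], item_vec: List[float], rating: float,
             lr: float = 0.1) -> Tuple[List[float], List[float]]:
    """One gradient step on the squared error (rating - prediction)^2 / 2."""
    err = rating - predict(user_vec, item_vec)
    new_user = [u + lr * err * i for u, i in zip(user_vec, item_vec)]
    new_item = [i + lr * err * u for u, i in zip(user_vec, item_vec)]
    return new_user, new_item


if __name__ == "__main__":
    # Hypothetical 2-D representations and a single observed rating of 1.0.
    user, item = [0.1, 0.1], [0.1, 0.1]
    for _ in range(200):
        user, item = sgd_step(user, item, rating=1.0)
    print(predict(user, item))
```

Note that the prediction is a single number with no measure of uncertainty, which is the limitation the abstract contrasts with Gaussian process models.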

covariance matrix, matrix, representation, (15 more...)

2106.04221

Country: Asia > Middle East > Lebanon (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

arXiv.org Artificial Intelligence Jun-6-2021

DAMSL: Domain Agnostic Meta Score-based Learning

Cai, John, Cai, Bill, Shen, Shengmei

In this paper, we propose Domain Agnostic Meta Score-based Learning (DAMSL), a novel, versatile and highly effective solution that delivers significant out-performance over state-of-the-art methods for cross-domain few-shot learning. We identify two key problems: previous meta-learning methods over-fit to the source domain, and previous transfer-learning methods under-utilize the structure of the support set. The core idea behind our method is that instead of directly using the scores from a fine-tuned feature encoder, we use these scores to create input coordinates for a domain agnostic metric space. A graph neural network is applied to learn an embedding and relation function over these coordinates to process all information contained in the score distribution of the support set. We test our model on both established CD-FSL benchmarks and new domains and show that our method overcomes the limitations of previous meta-learning and transfer-learning methods to deliver substantial improvements in accuracy across both smaller and larger domain shifts.

damsl, fine-tuning, learning, (12 more...)

2106.03041

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.84)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Tjøstheim, Dag, Jullum, Martin, Løland, Anders

Statistical embedding: Beyond principal components

arXiv.org Machine Learning Jun-3-2021

There has been intense recent activity in embedding of very high dimensional and nonlinear data structures, much of it in the data science and machine learning literature. We survey this activity in four parts. In the first part we cover nonlinear methods such as principal curves, multidimensional scaling, local linear methods, ISOMAP, graph based methods and kernel based methods. The second part is concerned with topological embedding methods, in particular mapping topological properties into persistence diagrams. Another type of data set with tremendous growth is very high-dimensional network data. The task considered in part three is how to embed such data in a vector space of moderate dimension to make the data amenable to traditional techniques such as clustering and classification. The final part of the survey deals with embedding in $\mathbb{R}^2$, which is visualization. Three methods are presented: $t$-SNE, UMAP and LargeVis, based on methods in parts one, two and three, respectively. The methods are illustrated and compared on two simulated data sets: one consisting of a triple of noisy Ranunculoid curves, and one consisting of networks of increasing complexity and with two types of nodes.

artificial intelligence, machine learning, statistical embedding, (17 more...)

2106.01858

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York (0.04)
North America > United States > Nevada (0.04)
(7 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Education (0.87)
Information Technology > Networks (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(5 more...)

arXiv.org Machine Learning Jun-1-2021

Weighting vectors for machine learning: numerical harmonic analysis applied to boundary detection

Bunch, Eric, Kline, Jeffery, Dickinson, Daniel, Bhat, Suhaas, Fung, Glenn

Metric space magnitude, an active field of research in algebraic topology, is a scalar quantity that summarizes the effective number of distinct points that live in a general metric space. The weighting vector is a closely related concept that captures, in a nontrivial way, much of the underlying geometry of the original metric space. Recent work has demonstrated that when the metric space is Euclidean, the weighting vector serves as an effective tool for boundary detection. We recast this result and show the weighting vector may be viewed as a solution to a kernelized SVM. As one consequence, we apply this new insight to the task of outlier detection, and we demonstrate performance that is competitive with or exceeds the performance of state-of-the-art techniques on benchmark data sets. Under mild assumptions, we show the weighting vector, which has the computational cost of matrix inversion, can be efficiently approximated in linear time. We show how nearest neighbor methods can approximate solutions to the minimization problems defined by SVMs.
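To make the weighting vector concrete: for a finite metric space with points x_1, ..., x_n, the standard definition forms the similarity matrix Z with entries Z_ij = exp(-d(x_i, x_j)) and takes the weighting vector w as the solution of Z w = 1; the magnitude is the sum of the entries of w. The sketch below computes this directly with a small Gaussian elimination, which reflects the matrix-inversion cost mentioned above. It is an illustration under that standard definition, not the authors' code, and the helper names are hypothetical.

```python
import math
from typing import List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular (are two points identical?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def weighting_vector(points: List[Sequence[float]]) -> List[float]:
    """Solve Z w = 1 with Z_ij = exp(-Euclidean distance between points i and j)."""
    z = [[math.exp(-math.dist(p, q)) for q in points] for p in points]
    return solve(z, [1.0] * len(points))


if __name__ == "__main__":
    # Three collinear points: the two end points form the "boundary".
    w = weighting_vector([(0.0,), (1.0,), (2.0,)])
    print("weights:", w)
    print("magnitude:", sum(w))
```

In this toy example the two end points receive larger weights than the interior point, which is the behaviour that makes the weighting vector usable for boundary detection.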

magnitude, vector, weighting vector, (16 more...)

2106.00827

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Wisconsin > Dane County > Madison (0.05)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.96)

arXiv.org Artificial Intelligence May-25-2021

Learning to Bridge Metric Spaces: Few-shot Joint Learning of Intent Detection and Slot Filling

Hou, Yutai, Lai, Yongkui, Chen, Cheng, Che, Wanxiang, Liu, Ting

In this paper, we investigate few-shot joint learning for dialogue language understanding. Most existing few-shot models learn a single task each time with only a few examples. However, dialogue language understanding contains two closely related tasks, i.e., intent detection and slot filling, and often benefits from jointly learning the two tasks. This calls for new few-shot learning techniques that are able to capture task relations from only a few examples and jointly learn multiple tasks. To achieve this, we propose a similarity-based few-shot learning scheme, named Contrastive Prototype Merging network (ConProm), that learns to bridge metric spaces of intent and slot on data-rich domains, and then adapts the bridged metric space to the specific few-shot domain. Experiments on two public datasets, Snips and FewJoint, show that our model significantly outperforms the strong baselines in one-shot and five-shot settings.

computational linguistic, proc, prototype, (15 more...)

2106.07343

Country:

Asia > China > Hong Kong (0.05)
Europe > Italy > Tuscany > Florence (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(6 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.83)

Khan, Aleem, Fleming, Elizabeth, Schofield, Noah, Bishop, Marcus, Andrews, Nicholas

A Deep Metric Learning Approach to Account Linking

arXiv.org Artificial Intelligence May-15-2021

We consider the task of linking social media accounts that belong to the same author in an automated fashion on the basis of the content and metadata of their corresponding document streams. We focus on learning an embedding that maps variable-sized samples of user activity -- ranging from single posts to entire months of activity -- to a vector space, where samples by the same author map to nearby points. The approach does not require human-annotated data for training purposes, which allows us to leverage large amounts of social media content. The proposed model outperforms several competitive baselines under a novel evaluation framework modeled after established recognition benchmarks in other domains. Our method achieves high linking accuracy, even with small samples from accounts not seen at training time, a prerequisite for practical applications of the proposed linking framework.

evaluation, experiment, subreddit, (15 more...)

2105.07263

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy > Tuscany > Florence (0.04)
Oceania > Australia (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Media (0.68)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)