AITopics

2404.14631

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > Promising Solution (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Ri, Narutatsu, Lee, Fei-Tzin, Verma, Nakul

Contrastive Loss is All You Need to Recover Analogies as Parallel Lines

arXiv.org Artificial Intelligence, Jun-13-2023

While static word embedding models are known to represent linguistic analogies as parallel lines in high-dimensional space, the underlying mechanism as to why they result in such geometric structures remains obscure. We find that an elementary contrastive-style method employed over distributional information performs competitively with popular word embedding models on analogy recovery tasks, while achieving dramatic speedups in training time. Further, we demonstrate that a contrastive loss is sufficient to create these parallel structures in word embeddings, and establish a precise relationship between the co-occurrence statistics and the geometric structure of the resulting word embeddings.

analogy, machine learning, natural language, (17 more...)

2306.08221

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > United States > California (0.04)
(12 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence, May-30-2021

Robust Dynamic Network Embedding via Ensembles

Hou, Chengbin, Fu, Guoji, Yang, Peng, He, Shan, Tang, Ke

Dynamic Network Embedding (DNE) has recently attracted considerable attention due to the advantage of network embedding in various applications and the dynamic nature of many real-world networks. For dynamic networks, the degree of changes, defined as the average number of changed edges between consecutive snapshots spanning a dynamic network, can differ widely across real-world scenarios. Although quite a few DNE methods have been proposed, it remains unclear whether and to what extent the existing DNE methods are robust to the degree of changes, which is nevertheless an important factor in both academic research and industrial applications. In this work, we investigate the robustness of DNE methods w.r.t. the degree of changes for the first time and, accordingly, propose a robust DNE method. Specifically, the proposed method follows the notion of ensembles, where the base learner adopts an incremental Skip-Gram neural embedding approach. To further boost the performance, a novel strategy is proposed to enhance the diversity among base learners at each timestep by capturing different levels of local-global topology. Extensive experiments demonstrate the benefits of the special designs in the proposed method, and the superior performance of the proposed method compared to state-of-the-art methods. The comparative study also reveals the robustness issue of some DNE methods. The source code is available at https://github.com/houchengbin/SG-EDNE
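
The "degree of changes" defined in the abstract above can be illustrated with a small sketch. It assumes that a "changed edge" is any edge added or removed between two consecutive snapshots (the symmetric difference of the edge sets); the abstract does not spell this out, and the function name `degree_of_changes` and the toy snapshots are hypothetical.

```python
def edges(*pairs):
    """Build a set of undirected edges; each edge is a frozenset of two nodes."""
    return {frozenset(pair) for pair in pairs}


def degree_of_changes(snapshots):
    """Average number of changed edges between consecutive snapshots.

    Assumption: an edge counts as changed if it was added or removed,
    i.e. it lies in the symmetric difference of the two edge sets.
    """
    if len(snapshots) < 2:
        return 0.0
    changes = [len(prev ^ curr) for prev, curr in zip(snapshots, snapshots[1:])]
    return sum(changes) / len(changes)


# Toy dynamic network with three snapshots (illustrative data only).
snapshots = [
    edges(("a", "b"), ("b", "c")),
    edges(("a", "b"), ("b", "c"), ("c", "d")),  # one edge added
    edges(("a", "b"), ("c", "d"), ("a", "d")),  # one removed, one added
]

average_changes = degree_of_changes(snapshots)
print(average_changes)
```

Under this reading, a network with a large average has many edges rewired per timestep, which is the regime in which the abstract asks whether DNE methods stay robust.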

dne method, dynamic network, node, (13 more...)

2105.14557

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry:

Information Technology (0.68)
Education > Educational Setting (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

#artificialintelligence, Nov-30-2020, 21:19:25 GMT

Word Embeddings in High-Level

The most common representation of words in NLP tasks is the One Hot Encoding. Consider the One Hot Encoding of the words "Cat" and "Dog". These two vectors are independent (orthogonal) since their inner product is 0, and their Euclidean distance is \(\sqrt{2}\). Notice that this applies to every pair in the vocabulary: every pair of distinct words is independent, and their distance is \(\sqrt{2}\). In other words, the distance, and hence the similarity, between any pair of words is the same. This is an issue for NLP tasks since we want to be able to capture the relation between words.
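
The claim about inner products and distances can be checked with a minimal sketch. The three-word vocabulary and the helper names below are hypothetical and only serve the illustration.

```python
import math
from itertools import combinations


def one_hot(index, size):
    """Return a one-hot vector of length `size` with a 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Hypothetical vocabulary, chosen only for illustration.
vocab = ["cat", "dog", "fish"]
vectors = {word: one_hot(i, len(vocab)) for i, word in enumerate(vocab)}

# Every pair of distinct words has inner product 0 and distance sqrt(2).
for w1, w2 in combinations(vocab, 2):
    print(w1, w2, dot(vectors[w1], vectors[w2]), euclidean(vectors[w1], vectors[w2]))
```

Because every pair comes out identical, "cat" is exactly as far from "dog" as it is from "fish", which is the limitation the passage describes.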

Country:

Europe > Italy > Piedmont > Turin Province > Turin (0.05)
Europe > Greece (0.05)
Europe > France (0.05)

Industry: Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.33)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.31)

#artificialintelligence, Oct-28-2019, 02:48:35 GMT

Using Word2Vec for Better Embeddings of Categorical Features

Back in 2012, when neural networks regained popularity, people were excited about the possibility of training models without having to worry about feature engineering. Indeed, most of the earliest breakthroughs were in computer vision, in which raw pixels were used as input for networks. Soon enough it turned out that if you wanted to use textual data, clickstream data, or pretty much any data with categorical features, at some point you'd have to ask yourself -- how do I represent my categorical features as vectors that my network can work with? The most popular approach is embedding layers -- you add an extra layer to your network, which assigns a vector to each value of the categorical feature. During training the network learns the weights for the different layers, including those embeddings.
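
As a rough sketch of what an embedding layer does at lookup time, the table below maps each categorical value to a row of weights. The category names, table size, and helper functions are hypothetical; a real framework would also update the table by gradient descent during training, which is not shown here.

```python
import random


def make_embedding_table(num_values, dim, seed=0):
    """Create a randomly initialised table with one `dim`-sized row per value."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_values)]


def embed(table, value_id):
    """Embedding lookup: return the row assigned to a categorical value."""
    return table[value_id]


def one_hot_matmul(table, value_id):
    """Multiply a one-hot row vector by the table; this equals a row lookup."""
    num_values, dim = len(table), len(table[0])
    one_hot = [1.0 if i == value_id else 0.0 for i in range(num_values)]
    return [
        sum(one_hot[i] * table[i][j] for i in range(num_values))
        for j in range(dim)
    ]


# Hypothetical categorical feature with three values.
categories = {"advertiser_a": 0, "advertiser_b": 1, "advertiser_c": 2}
table = make_embedding_table(len(categories), dim=4)

vector = embed(table, categories["advertiser_b"])
print(vector)
```

The `one_hot_matmul` helper is included only to show that the lookup is equivalent to feeding a one-hot vector through a linear layer without bias.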

advertiser, categorical feature, vector, (7 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Almagro-Blanco, Pedro, Sancho-Caparrini, Fernando

Improving Skip-Gram based Graph Embeddings via Centrality-Weighted Sampling

arXiv.org Machine Learning, Jul-20-2019

Network embedding techniques inspired by word2vec represent an effective unsupervised relational learning model. Commonly, by means of a Skip-Gram procedure, these techniques learn low-dimensional vector representations of the nodes in a graph by sampling node-context examples. Although many ways of sampling the context of a node have been proposed, the effects of the way a node is chosen have not been analyzed in depth. To fill this gap, we have re-implemented the four main word2vec-inspired graph embedding techniques under the same framework and analyzed how different sampling distributions affect embedding performance when tested on node classification problems. We present a set of experiments on different well-known real data sets that show how the use of popular centrality distributions in sampling leads to improvements, obtaining speedups of up to 2 times in learning time and increasing accuracy in all cases.
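
The idea of choosing nodes according to a centrality distribution can be sketched as follows. The abstract does not say which centrality measures were used, so degree centrality is an assumption here, and the toy graph and function names are hypothetical.

```python
import random
from collections import Counter


def degree_centrality(adjacency):
    """Degree centrality: number of neighbours divided by (n - 1)."""
    n = len(adjacency)
    return {node: len(neighbours) / (n - 1) for node, neighbours in adjacency.items()}


def sample_nodes(adjacency, k, rng):
    """Draw k nodes with probability proportional to their degree centrality.

    Assumption: degree centrality stands in for the "popular centrality
    distributions" mentioned in the abstract.
    """
    centrality = degree_centrality(adjacency)
    nodes = list(centrality)
    weights = [centrality[node] for node in nodes]
    return rng.choices(nodes, weights=weights, k=k)


# Toy star graph: "hub" is connected to every other node.
adjacency = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub"},
}

rng = random.Random(0)
counts = Counter(sample_nodes(adjacency, 1000, rng))
print(counts)
```

Each sampled node would then serve as the starting point for generating node-context examples, so well-connected nodes contribute more training examples than under uniform sampling.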

artificial intelligence, machine learning, representation, (16 more...)

1907.08793

Country:

Europe > Spain (0.28)
Oceania > Australia (0.28)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

#artificialintelligence, Jun-14-2019, 15:59:13 GMT

Word2vec Made Easy

This post is a simplified yet in-depth guide to word2vec. In this article, we will implement a word2vec model from scratch and see how embeddings help to find similar and dissimilar words. Word2Vec is a foundational technique in NLP (Natural Language Processing). Tomas Mikolov and a team of researchers developed the technique in 2013 at Google. Their approach was first published in the paper 'Efficient Estimation of Word Representations in Vector Space'.

machine learning, natural language, vector, (5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Alshargi, Faisal, Shekarpour, Saeedeh, Soru, Tommaso, Sheth, Amit

Concept2vec: Metrics for Evaluating Quality of Embeddings for Ontological Concepts

arXiv.org Artificial Intelligence, Jul-26-2018

Although there is an emerging trend towards generating embeddings for primarily unstructured data, and recently for structured data, there is not yet any systematic suite for measuring the quality of embeddings. This deficiency is further sensed with respect to embeddings generated for structured data because there are no concrete evaluation metrics measuring the quality of encoded structure as well as semantic patterns in the embedding space. In this paper, we introduce a framework containing three distinct tasks concerned with the individual aspects of ontological concepts: (i) the categorization aspect, (ii) the hierarchical aspect, and (iii) the relational aspect. Then, in the scope of each task, a number of intrinsic metrics are proposed for evaluating the quality of the embeddings. Furthermore, w.r.t. this framework multiple experimental studies were run to compare the quality of the available embedding models. Employing this framework in future research can reduce misjudgment and provide greater insight about quality comparisons of embeddings for ontological concepts.

data mining, machine learning, natural language, (22 more...)

1803.04488

Country:

Europe > Germany > Saxony > Leipzig (0.04)
North America > United States > Nevada (0.04)
Europe > Sweden > Uppsala County > Uppsala (0.04)
(4 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
(3 more...)

#artificialintelligence, Aug-1-2017, 03:24:15 GMT

A non-NLP application of Word2Vec – Towards Data Science – Medium

The above is exactly what Word2Vec seeks to do: it tries to determine the meaning of a word by analyzing its neighboring words (also called context). The algorithm exists in two flavors: CBOW and Skip-Gram. Given a set of sentences (also called a corpus), the model loops over the words of each sentence and either uses the current word to predict its neighbors (its context), in which case the method is called "Skip-Gram", or uses each of these contexts to predict the current word, in which case the method is called "Continuous Bag Of Words" (CBOW). The limit on the number of words in each context is determined by a parameter called "window size". So if we choose, for example, the Skip-Gram method, Word2Vec then consists of using a shallow neural network, i.e. a neural network with only one hidden layer, to learn the word embedding. The network first initializes its weights randomly, then iteratively adapts them during training to minimize the error it makes when using words to predict their contexts.
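
The difference between the two flavors shows up in how training examples are built from a sentence. The sketch below assumes that "window size" means the number of words taken on each side of the current word; the function names and the example sentence are hypothetical, and the neural network itself is not shown.

```python
def context_indices(position, length, window_size):
    """Indices of the neighbours within `window_size` words on each side."""
    start = max(0, position - window_size)
    stop = min(length, position + window_size + 1)
    return [j for j in range(start, stop) if j != position]


def skipgram_pairs(sentence, window_size):
    """Skip-Gram: one (current word, neighbour) pair per context word."""
    pairs = []
    for i, center in enumerate(sentence):
        for j in context_indices(i, len(sentence), window_size):
            pairs.append((center, sentence[j]))
    return pairs


def cbow_examples(sentence, window_size):
    """CBOW: one (list of context words, current word) example per position."""
    examples = []
    for i, target in enumerate(sentence):
        context = [sentence[j] for j in context_indices(i, len(sentence), window_size)]
        examples.append((context, target))
    return examples


sentence = ["the", "cat", "sat", "down"]

for pair in skipgram_pairs(sentence, window_size=1):
    print("skip-gram:", pair)
for example in cbow_examples(sentence, window_size=1):
    print("cbow:", example)
```

Skip-Gram produces one prediction target per neighbor, while CBOW groups all neighbors into a single input for predicting the current word.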

machine learning, natural language, word2vec, (16 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Landgraf, Andrew J., Bellay, Jeremy

word2vec Skip-Gram with Negative Sampling is a Weighted Logistic PCA

arXiv.org Machine Learning, May-26-2017

Mikolov et al. (2013) introduced the skip-gram formulation for neural word embeddings, wherein one tries to predict the context of a given word. Their negative-sampling algorithm improved the computational feasibility of training the embeddings. Due to their state-of-the-art performance on a number of tasks, there has been much research aimed at better understanding it. Goldberg and Levy (2014) showed that skip-gram with negative-sampling algorithm (SGNS) maximizes a different likelihood than the skip-gram formulation poses and further showed how it is implicitly related to pointwise mutual information (Levy and Goldberg, 2014). We show that SGNS is a weighted logistic PCA, which is a special case of exponential family PCA for the binomial likelihood. Cotterell et al. (2017) showed that the skip-gram formulation can be viewed as exponential family PCA with a multinomial likelihood, but they did not make the connection between the negative-sampling algorithm and the binomial likelihood. Li et al. (2015) showed that SGNS is an explicit matrix factorization related to representation learning, but the matrix factorization objective they found was complicated and they did not find the connection to the binomial distribution or exponential family PCA.

artificial intelligence, factorization, machine learning, (16 more...)

1705.09755

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)