cbow
Detecting Turkish Synonyms Used in Different Time Periods
Yazar, Umur Togay, Kutlu, Mucahid
The dynamic structure of languages poses significant challenges for applying natural language processing models to historical texts, causing decreased performance in various downstream tasks. Turkish is a prominent example of rapid linguistic transformation due to the language reform in the 20th century. In this paper, we propose two methods for detecting synonyms used in different time periods, focusing on Turkish. In our first method, we use the Orthogonal Procrustes method to align the embedding spaces created from documents written in the corresponding time periods. In our second method, we extend the first by incorporating Spearman's correlation between word frequencies across the years. In our experiments, we show that our proposed methods outperform the baseline method. Furthermore, we observe that the efficacy of our methods remains consistent when the target time period shifts from the 1960s to the 1980s. However, their performance slightly decreases for subsequent time periods.
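The first method above relies on the Orthogonal Procrustes solution for aligning two embedding spaces. As a rough sketch of the underlying closed-form computation (toy data and function names of my own choosing, not the authors' code):

```python
import numpy as np

def orthogonal_procrustes(X, Y):
    """Find the orthogonal matrix W minimizing ||X @ W - Y||_F.

    X, Y: (n_words, dim) embeddings of the same anchor words in two
    time periods. Closed form: with U, S, Vt = svd(X.T @ Y), the
    minimizer is W = U @ Vt.
    """
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt

# Sanity check: if Y is an exact rotation of X, Procrustes recovers it.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
R, _ = np.linalg.qr(rng.normal(size=(5, 5)))  # random orthogonal matrix
Y = X @ R
W = orthogonal_procrustes(X, Y)
print(np.allclose(W, R))  # expected: True
```

After alignment, a word's vector from one period can be compared directly (e.g., by cosine similarity) with vectors from the other period to propose cross-period synonyms.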
Incremental and Data-Efficient Concept Formation to Support Masked Word Prediction
Lian, Xin, Baglodi, Nishant, MacLellan, Christopher J.
This paper introduces Cobweb4L, a novel approach for efficient language model learning that supports masked word prediction. The approach builds on Cobweb, an incremental system that learns a hierarchy of probabilistic concepts. Each concept stores the frequencies of words that appear in instances tagged with that concept label. The system utilizes an attribute value representation to encode words and their surrounding context into instances. Cobweb4L uses the information theoretic variant of category utility and a new performance mechanism that leverages multiple concepts to generate predictions. We demonstrate that with these extensions it significantly outperforms prior Cobweb performance mechanisms that use only a single node to generate predictions. Further, we demonstrate that Cobweb4L learns rapidly and achieves performance comparable to and even superior to Word2Vec. Next, we show that Cobweb4L and Word2Vec outperform BERT in the same task with less training data. Finally, we discuss future work to make our conclusions more robust and inclusive.
An Evaluation of Sindhi Word Embedding in Semantic Analogies and Downstream Tasks
Ali, Wazir, Tumrani, Saifullah, Kumar, Jay, Soomro, Tariq Rahim
In this paper, we present a new corpus for training Sindhi word embeddings, consisting of more than 61 million words crawled from multiple web resources. We design a preprocessing pipeline to filter unwanted text from the crawled data. The cleaned corpus is then fed to the state-of-the-art continuous-bag-of-words, skip-gram, and GloVe word embedding algorithms. To evaluate the pretrained embeddings, we use popular intrinsic and extrinsic evaluation approaches. The evaluation results reveal that continuous-bag-of-words and skip-gram perform better than GloVe and the existing Sindhi fastText word embeddings on both intrinsic and extrinsic evaluations.
Bidirectional Attention as a Mixture of Continuous Word Experts
Wibisono, Kevin Christian, Wang, Yixin
Bidirectional attention – composed of self-attention with positional encodings and the masked language model (MLM) objective – has emerged as a key component of modern large language models (LLMs). Despite its empirical success, few studies have examined its statistical underpinnings: What statistical model is bidirectional attention implicitly fitting? What sets it apart from its non-attention predecessors? We explore these questions in this paper. The key observation is that fitting a single-layer single-head bidirectional attention, upon reparameterization, is equivalent to fitting a continuous bag of words (CBOW) model with mixture-of-experts (MoE) weights. Further, bidirectional attention with multiple heads and multiple layers is equivalent to stacked MoEs and a mixture of MoEs, respectively. This statistical viewpoint reveals the distinct use of MoE in bidirectional attention, which aligns with its practical effectiveness in handling heterogeneous data. It also suggests an immediate extension to categorical tabular data, if we view each word location in a sentence as a tabular feature. Across empirical studies, we find that this extension outperforms existing tabular extensions of transformers in out-of-distribution (OOD) generalization. Finally, this statistical perspective of bidirectional attention enables us to theoretically characterize when linear word analogies are present in its word embeddings. These analyses show that bidirectional attention can require much stronger assumptions to exhibit linear word analogies than its non-attention predecessors.
Realised Volatility Forecasting: Machine Learning via Financial Word Embedding
Rahimikia, Eghbal, Zohren, Stefan, Poon, Ser-Huang
This study develops FinText, a financial word embedding compiled from 15 years of business news archives. The results show that FinText produces substantially more accurate results than general word embeddings based on the gold-standard financial benchmark we introduced. In contrast to well-known econometric models, and over the sample period from 27 July 2007 to 27 January 2022 for 23 NASDAQ stocks, using stock-related news, our simple natural language processing model supported by different word embeddings improves realised volatility forecasts on high volatility days. This improvement in realised volatility forecasting performance switches to normal volatility days when general hot news is used. By utilising SHAP, an Explainable AI method, we also identify and classify key phrases in stock-related and general hot news that moved volatility.
Word2Vec
Word2Vec is a two-layer neural network, with continuous bag-of-words (CBOW) and skip-gram architectures, that captures semantic information. It generates word embeddings (mappings of words into a vector space) for a given text corpus. It converts words into vectors, and these vectors support operations such as addition, subtraction, and distance calculation that preserve the relationships among words. How are these relationships among words formed? Word2Vec assigns similar vector representations to similar words.
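The add/subtract behavior described above can be illustrated with toy vectors. The embedding values below are invented for illustration only; real Word2Vec vectors are learned from a corpus:

```python
import math

# Hypothetical 3-dimensional embeddings (values made up for illustration).
emb = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.2, 0.8],
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# The classic analogy: "king" - "man" + "woman" should land near "queen".
target = [k - m + w for k, m, w in zip(emb["king"], emb["man"], emb["woman"])]
nearest = max((w for w in emb if w != "king"),
              key=lambda w: cosine(target, emb[w]))
print(nearest)  # queen
```

With trained embeddings the same arithmetic is performed in hundreds of dimensions, but the principle is identical: relationships among words are encoded as directions in the vector space.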
Theory Behind the Basics of NLP - Analytics Vidhya
This article was published as a part of the Data Science Blogathon. Natural Language Processing (NLP) can help you understand the sentiment of any text. This is helpful for understanding the emotions behind, and the type of, the text being examined: negative and positive comments can be easily differentiated. NLP aims to make machines understand text or comments the same way humans can.
How to Use Arabic Word2Vec Word Embedding with LSTM
Word embedding is the approach of learning words and their relative meanings from a corpus of text and representing each word as a dense vector. The word vector is the projection of the word into a continuous feature vector space; see Figure 1 (A) for clarity. Words that have similar meanings should be close together in the vector space, as illustrated in Figure 1 (B). Word2vec is one of the most popular word embedding methods in NLP. Word2vec has two variants, the Continuous Bag-of-Words Model (CBOW) and the Continuous Skip-gram Model [3]; the model architectures are shown in Figure 2. CBOW predicts a word according to its given context, whereas Skip-gram predicts the context according to a given word, which increases the computational complexity [3].
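The difference in prediction direction between the two variants can be seen in how each slices a sentence into training pairs. This is a minimal sketch of pair generation with a fixed context window, not Word2vec's actual implementation (which also uses subsampling and dynamic windows):

```python
def training_pairs(tokens, window=2):
    """Return (CBOW pairs, skip-gram pairs) for a tokenized sentence.

    CBOW: (context words) -> center word, one pair per position.
    Skip-gram: center word -> each context word, so one position yields
    up to 2*window pairs, which raises the per-sentence training cost.
    """
    cbow, skipgram = [], []
    for i, center in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        cbow.append((tuple(context), center))
        skipgram.extend((center, c) for c in context)
    return cbow, skipgram

cbow, sg = training_pairs(["the", "cat", "sat", "on", "mat"])
print(cbow[2])         # (('the', 'cat', 'on', 'mat'), 'sat')
print(len(cbow), len(sg))
```

Note that the skip-gram list is several times longer than the CBOW list for the same sentence, which reflects the extra computational complexity mentioned above.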
Word Embeddings
A word embedding is a representation of a word as a vector, i.e., a sequence of numbers. Often these vectors encode how the word is used in conjunction with other words in a dataset. Both the encoding technique and the dataset used can vary greatly, and the right choice ultimately depends on the use case. Word embeddings have ubiquitous use cases in NLP/ML because they allow computers, or mathematical equations, to reason about words. Computers see words only as sequences of individual characters, which is rarely useful when reasoning about the semantic or syntactic usage of a word in a language.
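The character-sequence limitation above can be made concrete: a one-hot encoding (the simplest vector representation) treats every pair of distinct words as equally unrelated, whereas dense embeddings can place related words close together. The dense values below are invented purely for illustration:

```python
# One-hot vectors: every pair of distinct words has zero similarity,
# so the representation carries no semantic information at all.
vocab = ["cat", "kitten", "car"]
one_hot = {w: [1.0 if i == j else 0.0 for j in range(len(vocab))]
           for i, w in enumerate(vocab)}

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

print(dot(one_hot["cat"], one_hot["kitten"]))  # 0.0, same as cat vs. car

# Hypothetical dense embeddings (values invented for illustration):
# now "cat" can be measurably closer to "kitten" than to "car".
dense = {"cat": [0.9, 0.1], "kitten": [0.8, 0.2], "car": [0.1, 0.9]}
print(dot(dense["cat"], dense["kitten"]) > dot(dense["cat"], dense["car"]))  # True
```

Learned embeddings such as Word2Vec or GloVe produce exactly this kind of dense vector, with the geometry induced by word co-occurrence in the training dataset.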
kōan: A Corrected CBOW Implementation
İrsoy, Ozan, Benton, Adrian, Stratos, Karl
It is a common belief in the NLP community that continuous bag-of-words (CBOW) word embeddings tend to underperform skip-gram (SG) embeddings. We find that this belief is founded less on theoretical differences in their training objectives but more on faulty CBOW implementations in standard software libraries such as the official implementation word2vec.c and Gensim. We show that our correct implementation of CBOW yields word embeddings that are fully competitive with SG on various intrinsic and extrinsic tasks while being more than three times as fast to train. We release our implementation, kōan, at https://github.com/bloomberg/koan.
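The abstract attributes CBOW's reputation to faulty implementations rather than its objective. Without going into the paper's specifics, one subtlety any correct CBOW implementation must handle is the gradient scaling that follows from averaging: since the hidden layer is the mean of C context vectors, each context vector's gradient carries a 1/C factor. The toy loss below (a dot product standing in for the full softmax) is my own illustrative construction, not the kōan code:

```python
def cbow_loss(context_vecs, center_vec):
    """Toy CBOW step: hidden layer h = (1/C) * sum of context vectors,
    and the loss is -h . center (a stand-in for the real objective)."""
    C, dim = len(context_vecs), len(context_vecs[0])
    h = [sum(v[d] for v in context_vecs) / C for d in range(dim)]
    return -sum(h[d] * center_vec[d] for d in range(dim))

context = [[0.3, -0.1], [0.5, 0.4], [-0.2, 0.1]]
center = [1.0, 2.0]
C = len(context)

# Analytic gradient w.r.t. each context vector: -(1/C) * center.
analytic = [-c / C for c in center]

# Finite-difference check on context[0][0] confirms the 1/C factor.
eps = 1e-6
bumped = [list(v) for v in context]
bumped[0][0] += eps
numeric = (cbow_loss(bumped, center) - cbow_loss(context, center)) / eps
print(abs(numeric - analytic[0]) < 1e-4)  # True
```

An implementation that applied the full (unscaled) gradient to every context vector would effectively multiply the context-side learning rate by C, which is the kind of discrepancy that a corrected implementation must avoid.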