AITopics | domain-specific word

Collaborating Authors

domain-specific word

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Do Slides Help? Multi-modal Context for Automatic Transcription of Conference Talks

Sinhamahapatra, Supriti, Niehues, Jan

arXiv.org Artificial IntelligenceOct-17-2025

State-of-the-art (SOTA) Automatic Speech Recognition (ASR) systems primarily rely on acoustic information while disregarding additional multi-modal context. However, visual information are essential in disambiguation and adaptation. While most work focus on speaker images to handle noise conditions, this work also focuses on integrating presentation slides for the use cases of scientific presentation. In a first step, we create a benchmark for multi-modal presentation including an automatic analysis of transcribing domain-specific terminology. Next, we explore methods for augmenting speech models with multi-modal information. We mitigate the lack of datasets with accompanying slides by a suitable approach of data augmentation. Finally, we train a model using the augmented dataset, resulting in a relative reduction in word error rate of approximately 34%, across all words and 35%, for domain-specific terms compared to the baseline model.

artificial intelligence, deep learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2510.13979

Country:

North America > United States > Minnesota (0.28)
Europe > Germany (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Evaluation of Word Embeddings for the Social Sciences

Schiffers, Ricardo, Kern, Dagmar, Hienert, Daniel

arXiv.org Artificial IntelligenceFeb-13-2023

Word embeddings are an essential instrument in many NLP tasks. Most available resources are trained on general language from Web corpora or Wikipedia dumps. However, word embeddings for domain-specific language are rare, in particular for the social science domain. Therefore, in this work, we describe the creation and evaluation of word embedding models based on 37,604 open-access social science research papers. In the evaluation, we compare domain-specific and general language models for (i) language coverage, (ii) diversity, and (iii) semantic relationships. We found that the created domain-specific model, even with a relatively small vocabulary size, covers a large part of social science concepts, their neighborhoods are diverse in comparison to more general models. Across all relation types, we found a more extensive coverage of semantic relationships.

artificial intelligence, natural language, text processing, (17 more...)

arXiv.org Artificial Intelligence

2302.06174

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
Europe > Germany > Berlin (0.04)
(5 more...)

Genre:

Research Report (0.64)
Overview (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)

Add feedback

Working with Sparse Matrix Factorization part2(Machine Learning)

#artificialintelligenceJan-21-2023, 19:05:07 GMT

Abstract: Fast transforms correspond to factorizations of the form Z X(1)…X(J), where each factor X(ℓ) is sparse and possibly structured. This paper investigates essential uniqueness of such factorizations, i.e., uniqueness up to unavoidable scaling ambiguities. Our main contribution is to prove that any N N matrix having the so-called butterfly structure admits an essentially unique factorization into J butterfly factors (where N 2J), and that the factors can be recovered by a hierarchical factorization method, which consists in recursively factorizing the considered matrix into two factors. This hierarchical identifiability property relies on a simple identifiability condition in the two-layer and fixed-support setting. This approach contrasts with existing ones that fit the product of butterfly factors to a given matrix via gradient descent. The proposed method can be applied in particular to retrieve the factorization of the Hadamard or the discrete Fourier transform matrices of size N 2J.

artificial intelligence, data quality, machine learning, (11 more...)

#artificialintelligence

Genre: Research Report (0.60)

Technology:

Information Technology > Data Science > Data Quality > Data Transformation (0.58)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.40)

Add feedback

Analyzing Scientific Publications using Domain-Specific Word Embedding and Topic Modelling

Singhal, Trisha, Liu, Junhua, Blessing, Lucienne T. M., Lim, Kwan Hui

arXiv.org Artificial IntelligenceDec-23-2021

The scientific world is changing at a rapid pace, with new technology being developed and new trends being set at an increasing frequency. This paper presents a framework for conducting scientific analyses of academic publications, which is crucial to monitor research trends and identify potential innovations. This framework adopts and combines various techniques of Natural Language Processing, such as word embedding and topic modelling. Word embedding is used to capture semantic meanings of domain-specific words. We propose two novel scientific publication embedding, i.e., PUB-G and PUB-W, which are capable of learning semantic meanings of general as well as domain-specific words in various research fields. Thereafter, topic modelling is used to identify clusters of research topics within these larger research fields. We curated a publication dataset consisting of two conferences and two journals from 1995 to 2020 from two research domains. Experimental results show that our PUB-G and PUB-W embeddings are superior in comparison to other baseline embeddings by a margin of ~0.18-1.03 based on topic coherence.

dataset, proceedings, publication, (13 more...)

arXiv.org Artificial Intelligence

2112.1294

Country:

Asia > Singapore > Central Region > Singapore (0.04)
North America > United States > Tennessee > Anderson County > Oak Ridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.69)
(3 more...)

Add feedback

Topic Driven Adaptive Network for Cross-Domain Sentiment Classification

Zhu, Yicheng, Qiu, Yiqiao, Rao, Yanghui

arXiv.org Artificial IntelligenceNov-28-2021

Cross-domain sentiment classification has been a hot spot these years, which aims to learn a reliable classifier using labeled data from the source domain and evaluate it on the target domain. In this vein, most approaches utilized domain adaptation that maps data from different domains into a common feature space. To further improve the model performance, several methods targeted to mine domain-specific information were proposed. However, most of them only utilized a limited part of domain-specific information. In this study, we first develop a method of extracting domain-specific words based on the topic information. Then, we propose a Topic Driven Adaptive Network (TDAN) for cross-domain sentiment classification. The network consists of two sub-networks: semantics attention network and domain-specific word attention network, the structures of which are based on transformers. These sub-networks take different forms of input and their outputs are fused as the feature vector. Experiments validate the effectiveness of our TDAN on sentiment classification across domains.

classification, domain-specific word, sentiment classification, (15 more...)

arXiv.org Artificial Intelligence

2111.14094

Country: Asia > Middle East > Jordan (0.05)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)

Add feedback

Learning Domain-Sensitive and Sentiment-Aware Word Embeddings

Shi, Bei, Fu, Zihao, Bing, Lidong, Lam, Wai

arXiv.org Artificial IntelligenceMay-9-2018

Word embeddings have been widely used in sentiment classification because of their efficacy for semantic representations of words. Given reviews from different domains, some existing methods for word embeddings exploit sentiment information, but they cannot produce domain-sensitive embeddings. On the other hand, some other existing methods can generate domain-sensitive word embeddings, but they cannot distinguish words with similar contexts but opposite sentiment polarity. We propose a new method for learning domain-sensitive and sentiment-aware embeddings that simultaneously capture the information of sentiment semantics and domain sensitivity of individual words. Our method can automatically determine and produce domain-common embeddings and domain-specific embeddings. The differentiation of domain-common and domain-specific words enables the advantage of data augmentation of common semantics from multiple domains and capture the varied semantics of specific words from different domains at the same time. Experimental results show that our model provides an effective way to learn domain-sensitive and sentiment-aware word embeddings which benefit sentiment classification at both sentence level and lexicon term level.

machine learning, natural language, sentiment classification, (18 more...)

arXiv.org Artificial Intelligence

1805.03801

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.88)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
(2 more...)

Add feedback

Neural Networks Incorporating Dictionaries for Chinese Word Segmentation

Zhang, Qi (Fudan University) | Liu, Xiaoyu (Fudan University) | Fu, Jinlan (Fudan University)

AAAI ConferencesFeb-8-2018

In recent years, deep neural networks have achieved significant success in Chinese word segmentation and many other natural language processing tasks. Most of these algorithms are end-to-end trainable systems and can effectively process and learn from large scale labeled datasets. However, these methods typically lack the capability of processing rare words and data whose domains are different from training data. Previous statistical methods have demonstrated that human knowledge can provide valuable information for handling rare cases and domain shifting problems. In this paper, we seek to address the problem of incorporating dictionaries into neural networks for the Chinese word segmentation task. Two different methods that extend the bi-directional long short-term memory neural network are proposed to perform the task. To evaluate the performance of the proposed methods, state-of-the-art supervised models based methods and domain adaptation approaches are compared with our methods on nine datasets from different domains. The experimental results demonstrate that the proposed methods can achieve better performance than other state-of-the-art neural network methods and domain adaptation approaches in most cases.

machine learning, natural language, segmentation, (17 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Linking Heterogeneous Input Features with Pivots for Domain Adaptation

Zhou, Guangyou (Central China Normal University) | He, Tingting (Central China Normal University) | Wu, Wensheng (University of Southern California) | Hu, Xiaohua Tony (Central China Normal University)

AAAI ConferencesJul-15-2015

Sentiment classification aims to automatically predict sentiment polarity (e.g., positive or negative) of user generated sentiment data (e.g., reviews, blogs). In real applications, these user generated sentiment data can span so many different domains that it is difficult to manually label training data for all of them. Hence, this paper studies the problem of domain adaptation for sentiment classification where a systemtrained using labeled reviews from a source domain is deployed to classify sentimentsof reviews in a different target domain. In this paper, we propose to link heterogeneous input features with pivots via joint non-negative matrix factorization. This is achieved by learning the domain-specific information from different domains into unified topics, with the help of pivots across all domains. We conduct experiments on a benchmark composed of reviews of 4 types of Amazon products. Experimental results show that our proposed approach significantly outperforms the baseline method, and achieves an accuracy which is competitive with the state-of-the-art methods for sentiment classification adaptation.

adaptation, classification, sentiment classification, (14 more...)

AAAI Conferences

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Add feedback