AITopics | sentence pair

sentence pair

SITUATEDGEN: Incorporating Geographical and Temporal Contexts into Generative Commonsense Reasoning

Neural Information Processing Systems | Apr-29-2026, 21:50:28 GMT

Recently, commonsense reasoning in text generation has attracted much attention. Generative commonsense reasoning is the task that requires machines, given a group of keywords, to compose a single coherent sentence with commonsense plausibility. While existing datasets targeting generative commonsense reasoning focus on everyday scenarios, it is unclear how well machines reason under specific geographical and temporal contexts. We formalize this challenging task as SITUATEDGEN, where machines with commonsense should generate a pair of contrastive sentences given a group of keywords including geographical or temporal entities. We introduce a corresponding English dataset consisting of 8,268 contrastive sentence pairs, which are built upon several existing commonsense reasoning benchmarks with minimal manual labor. Experiments show that state-of-the-art generative language models struggle to generate sentences with commonsense plausibility and still lag far behind human performance.

artificial intelligence, computational linguistic, natural language, (16 more...)

Country:

Europe (1.00)
North America > United States > Michigan (0.28)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

EMMA-X: An EM-like Multilingual Pre-training Algorithm for Cross-lingual Representation Learning

Neural Information Processing Systems | Apr-25-2026, 18:37:27 GMT

Expressing universal semantics common to all languages is helpful in understanding the meanings of complex and culture-specific sentences. The research theme underlying this scenario focuses on learning universal representations across languages with the usage of massive parallel corpora. However, due to the sparsity and scarcity of parallel data, there is still a big challenge in learning authentic "universals" for any two languages. In this paper, we propose EMMA-X: an EM-like Multilingual pre-training Algorithm, to learn (X)Cross-lingual universals with the aid of excessive multilingual non-parallel data.

computational linguistic, machine learning, natural language, (17 more...)

Country:

Europe (1.00)
North America > United States > Minnesota (0.28)

Industry:

Government (0.67)
Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

0f6cc80ad86e553d085842308e0fd2cb-Paper-Conference.pdf

Neural Information Processing Systems | Apr-24-2026, 14:34:56 GMT

data mining, machine learning, natural language, (21 more...)

Country: Europe (0.67)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Communications (0.93)
Information Technology > Data Science > Data Mining (0.69)

A Datasheet for S

Neural Information Processing Systems | Feb-17-2026, 07:57:41 GMT

These sentence pairs will serve as a means to evaluate machines' commonsense reasoning abilities under different extra-linguistic contexts.

artificial intelligence, commonsense reasoning, natural language, (19 more...)

Country:

Oceania > Australia (0.05)
Asia > China (0.05)
North America > United States > Texas (0.04)
(6 more...)

Industry:

Law (0.68)
Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (0.49)

Incorporating Geographical and Temporal Contexts into Generative Commonsense Reasoning

Neural Information Processing Systems | Feb-17-2026, 07:57:38 GMT

artificial intelligence, computational linguistic, natural language, (16 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
Oceania > Australia (0.06)
(18 more...)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Appendix for Data Diversification: A Simple Strategy For Neural Machine Translation

Xuan-Phi Nguyen

Neural Information Processing Systems | Feb-8-2026, 21:46:07 GMT

Finally, we describe the training setup for our back-translation experiments. We continue to differentiate our method from other existing works. Our method does not train multiple peer models with EM training either. In each round, a forward (or backward) model takes its turn to play the "back-translation" role to train [...]. The role is switched in the next round. In other words, source and target are identical.

artificial intelligence, experiment, natural language, (16 more...)

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > Canada (0.04)
Europe > Germany > Berlin (0.04)
(4 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

201408406e0c5cf7626c4baeae6eaadd-Paper-Conference.pdf

Neural Information Processing Systems | Feb-8-2026, 18:53:10 GMT

computational linguistic, proceedings, representation, (12 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Dominican Republic (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(21 more...)

Industry:

Government (0.67)
Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

GAPX: Generalized Autoregressive Paraphrase-Identification X

Neural Information Processing Systems | Feb-7-2026, 11:39:37 GMT

Paraphrases are sentences or phrases that convey the same meaning using different wording, and are fundamental to the understanding of languages [7]. Paraphrase Identification is the well-studied task of identifying whether a given pair of sentences has the same meaning [51, 47, 56, 57, 31], and has many important downstream applications such as machine translation [61, 44, 40, 27] and question answering [11, 35].

machine learning, natural language, (19 more...)

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Sweden > Uppsala County > Uppsala (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Asia > China (0.04)

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Enhancing OCR for Sino-Vietnamese Language Processing via Fine-tuned PaddleOCRv5

Nguyen, Minh Hoang, Thiet, Su Nguyen

arXiv.org Artificial Intelligence | Dec-2-2025

Recognizing and processing Classical Chinese (Han-Nom) texts plays a vital role in digitizing Vietnamese historical documents and enabling cross-lingual semantic research. However, existing OCR systems struggle with degraded scans, non-standard glyphs, and handwriting variations common in ancient sources. In this work, we propose a fine-tuning approach for PaddleOCRv5 to improve character recognition on Han-Nom texts. We retrain the text recognition module using a curated subset of ancient Vietnamese Chinese manuscripts, supported by a full training pipeline covering preprocessing, LMDB conversion, evaluation, and visualization. Experimental results show a significant improvement over the base model, with exact accuracy increasing from 37.5 percent to 50.0 percent, particularly under noisy image conditions. Furthermore, we develop an interactive demo that visually compares pre- and post-fine-tuning recognition results, facilitating downstream applications such as Han-Vietnamese semantic alignment, machine translation, and historical linguistics research. The demo is available at https://huggingface.co/spaces/MinhDS/Fine-tuned-PaddleOCRv5
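
The "exact accuracy" figure quoted in this abstract can be illustrated with a minimal sketch. It assumes the metric is the fraction of text lines whose predicted string matches the reference string exactly; the abstract does not define the metric, so this reading, the function name `exact_accuracy`, and the toy strings are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence


def exact_accuracy(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of predictions that equal their reference string exactly.

    Assumed definition: a prediction counts only if the whole string matches.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one sample is required")
    matches = sum(pred == ref for pred, ref in zip(predictions, references))
    return matches / len(references)


# Toy placeholder strings (hypothetical, not data from the paper).
refs = ["ab", "cd", "ef", "gh"]
preds = ["ab", "cx", "ef", "gy"]  # 2 of 4 lines match exactly
print(exact_accuracy(preds, refs))
```

Under this reading a single wrong character makes the whole line count as an error, which is why exact accuracy tends to be much lower than character-level accuracy on noisy scans.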

machine learning, natural language, pattern recognition, (18 more...)

2510.04003

Country: Asia > Vietnam (0.30)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.91)
Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.73)

A transfer learning approach for automatic conflicts detection in software requirement sentence pairs based on dual encoders

Wang, Yizheng, Jiang, Tao, Bai, Jinyan, Zou, Zhengbin, Xue, Tiancheng, Zhang, Nan, Luan, Jie

arXiv.org Artificial Intelligence | Dec-1-2025

Software Requirement Documents (RDs) typically contain tens of thousands of individual requirements, and ensuring consistency among these requirements is critical for the success of software engineering projects. Automated detection methods can significantly enhance efficiency and reduce costs; however, existing approaches still face several challenges, including low detection accuracy on imbalanced data, limited semantic extraction due to the use of a single encoder, and suboptimal performance in cross-domain transfer learning. To address these issues, this paper proposes a Transferable Software Requirement Conflict Detection Framework based on SBERT and SimCSE, termed TSRCDF-SS. First, the framework employs two independent encoders, Sentence-BERT (SBERT) and Simple Contrastive Sentence Embedding (SimCSE), to generate sentence embeddings for requirement pairs, followed by a six-element concatenation strategy. Furthermore, the classifier is enhanced by a two-layer fully connected feedforward neural network (FFNN) with a hybrid loss optimization strategy that integrates a variant of Focal Loss, domain-specific constraints, and a confidence-based penalty term. Finally, the framework synergistically integrates sequential and cross-domain transfer learning. Experimental results demonstrate that the proposed framework achieves a 10.4% improvement in both macro-F1 and weighted-F1 scores in in-domain settings, and an 11.4% increase in macro-F1 in cross-domain scenarios.
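
The abstract mentions a variant of Focal Loss for imbalanced data without giving its form. As background only, here is a minimal sketch of the standard focal loss, FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t), computed for a single example. It is not the paper's variant, and the function names and default values of `gamma` and `alpha` are assumptions chosen for illustration.

```python
import math


def cross_entropy(p_true: float) -> float:
    """Cross-entropy for one example, given the probability of the true class."""
    p = min(max(p_true, 1e-12), 1.0)  # clamp to avoid log(0)
    return -math.log(p)


def focal_loss(p_true: float, gamma: float = 2.0, alpha: float = 1.0) -> float:
    """Standard focal loss for one example (not the paper's variant).

    The factor (1 - p)^gamma shrinks the loss of well-classified examples,
    so hard or minority-class examples dominate the total loss.
    """
    p = min(max(p_true, 1e-12), 1.0)
    return -alpha * (1.0 - p) ** gamma * math.log(p)


# An easy example (p = 0.9) is down-weighted far more than a hard one (p = 0.1).
for p in (0.9, 0.1):
    print(p, cross_entropy(p), focal_loss(p))
```

With `gamma = 0` and `alpha = 1` the expression reduces to ordinary cross-entropy, which makes the down-weighting effect easy to compare.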

large language model, machine learning, natural language, (23 more...)

2511.23007

Country: