AITopics | nlp task

Collaborating Authors

nlp task

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Glyce: Glyph-vectors for Chinese Character Representations

Yuxian Meng, Wei Wu, Fei Wang, Xiaoya Li, Ping Nie, Fan Yin, Muyu Li, Qinghong Han, Xiaofei Sun, Jiwei Li

Neural Information Processing SystemsFeb-12-2026, 01:04:26 GMT

Neural Information Processing Systems http://nips.cc/

arxiv preprint arxiv, classification task, representation, (13 more...)

Neural Information Processing Systems

Country:

Oceania > Australia > Victoria > Melbourne (0.04)
North America > Canada (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

452bf208bf901322968557227b8f6efe-AuthorFeedback.pdf

Neural Information Processing SystemsFeb-12-2026, 01:04:10 GMT

ablation study, confusion, logographic language, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)

Add feedback

Leveraging the Inductive Bias of Large Language Models for Abstract Textual Reasoning

Neural Information Processing SystemsDec-24-2025, 11:17:24 GMT

Large natural language models (LMs) (such as GPT-3 or T5) demonstrate impressive abilities across a range of general NLP tasks. Here, we show that the knowledge embedded in such models provides a useful inductive bias, not just on traditional NLP tasks, but also in the nontraditional task of training a symbolic reasoning engine. We observe that these engines learn quickly and generalize in a natural way that reflects human intuition. For example, training such a system to model block-stacking might naturally generalize to stacking other types of objects because of structure in the real world that has been partially captured by the language describing it. We study several abstract textual reasoning tasks, such as object manipulation and navigation, and demonstrate multiple types of generalization to novel scenarios and the symbols that comprise them. We also demonstrate the surprising utility of $\textit{compositional learning}$, where a learner dedicated to mastering a complicated task gains an advantage by training on relevant simpler tasks instead of jumping straight to the complicated task.

abstract textual reasoning, inductive bias, language model, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.40)

Add feedback

O(n) Connections are Expressive Enough: Universal Approximability of Sparse Transformers

Neural Information Processing SystemsDec-24-2025, 09:16:33 GMT

Recently, Transformer networks have redefined the state of the art in many NLP tasks. However, these models suffer from quadratic computational cost in the input sequence length $n$ to compute pairwise attention in each layer. This has prompted recent research into sparse Transformers that sparsify the connections in the attention layers. While empirically promising for long sequences, fundamental questions remain unanswered: Can sparse Transformers approximate any arbitrary sequence-to-sequence function, similar to their dense counterparts? How does the sparsity pattern and the sparsity level affect their performance? In this paper, we address these questions and provide a unifying framework that captures existing sparse attention models. We propose sufficient conditions under which we prove that a sparse attention model can universally approximate any sequence-to-sequence function. Surprisingly, our results show that sparse Transformers with only $O(n)$ connections per attention layer can approximate the same function class as the dense model with $n^2$ connections.

name change, transformer, universal approximability, (8 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.60)

Technology: Information Technology > Artificial Intelligence (0.44)

Add feedback

Improving Natural Language Processing Tasks with Human Gaze-Guided Neural Attention

Neural Information Processing SystemsDec-24-2025, 00:11:18 GMT

A lack of corpora has so far limited advances in integrating human gaze data as a supervisory signal in neural attention mechanisms for natural language processing (NLP). We propose a novel hybrid text saliency model (TSM) that, for the first time, combines a cognitive model of reading with explicit human gaze supervision in a single machine learning framework. On four different corpora we demonstrate that our hybrid TSM duration predictions are highly correlated with human gaze ground truth. We further propose a novel joint modeling approach to integrate TSM predictions into the attention layer of a network designed for a specific upstream NLP task without the need for any task-specific human gaze data. We demonstrate that our joint model outperforms the state of the art in paraphrase generation on the Quora Question Pairs corpus by more than 10% in BLEU-4 and achieves state of the art performance for sentence compression on the challenging Google Sentence Compression corpus. As such, our work introduces a practical approach for bridging between data-driven and cognitive models and demonstrates a new way to integrate human gaze-guided neural attention into NLP tasks.

human gaze-guided neural attention, name change, natural language processing task, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Add feedback

Teaching by Failure: Counter-Example-Driven Curricula for Transformer Self-Improvement

Vejendla, Harshil

arXiv.org Artificial IntelligenceDec-2-2025

Transformer models often exhibit brittle extrapolation, failing on inputs that are longer or structurally more complex than those seen during training. We introduce Counter-Example-Driven Curricula (CEDC), an automated framework that improves model robustness by iteratively focusing on its own failures. At each step, CEDC uses the current model to generate a diverse set of candidate problems, employs a fast, executable verifier to identify incorrect predictions (counter-examples), and then fine-tunes the model on a dataset enriched with these discovered failures. We evaluate CEDC on a suite of algorithmic and natural language tasks, including integer addition, sorting, Dyck-2 language recognition, and three text classification benchmarks. Compared to static training and standard curriculum learning baselines, CEDC achieves up to 30x greater length extrapolation, is 3.75x more computationally efficient than uniform data augmentation, and requires no manual difficulty heuristics. We provide a detailed analysis of the counter-examples, showing how the curriculum naturally adapts to target progressively more complex error modes. Our findings establish verifier-guided, failure-driven learning as a simple, powerful, and efficient paradigm for enhancing the generalization capabilities of Transformer models.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2512.01187

Genre:

Instructional Material > Course Syllabus & Notes (0.50)
Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

Instruction Tuning With Loss Over Instructions

Neural Information Processing SystemsNov-19-2025, 18:38:58 GMT

Further analysis substantiates our hypothesis that our improvement can be attributed to reduced overfitting to instruction tuning datasets. It is worth noting that we are not proposing IM as a replacement for the current instruction tuning process. Instead, our work aims to provide practical guidance for instruction tuning LMs, especially in low-resource scenarios.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Middle East > Jordan (0.04)
(8 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

Modality Matching Matters: Calibrating Language Distances for Cross-Lingual Transfer in URIEL+

Ng, York Hay, Khan, Aditya, Lu, Xiang, Salloum, Matteo, Zhou, Michael, Hoang, Phuong H., Doğruöz, A. Seza, Lee, En-Shiun Annie

arXiv.org Artificial IntelligenceOct-23-2025

Existing linguistic knowledge bases such as URIEL+ provide valuable geographic, genetic and typological distances for cross-lingual transfer but suffer from two key limitations. One, their one-size-fits-all vector representations are ill-suited to the diverse structures of linguistic data, and two, they lack a principled method for aggregating these signals into a single, comprehensive score. In this paper, we address these gaps by introducing a framework for type-matched language distances. We propose novel, structure-aware representations for each distance type: speaker-weighted distributions for geography, hyperbolic embeddings for genealogy, and a latent variables model for typology. We unify these signals into a robust, task-agnostic composite distance. In selecting transfer languages, our representations and composite distances consistently improve performance across a wide range of NLP tasks, providing a more principled and effective toolkit for multilingual research.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.19217

Country: