AITopics | pos tag

Slovak Conceptual Dictionary

Blšták, Miroslav

arXiv.org Artificial Intelligence, Dec-2-2025

Natural language processing tasks sometimes require dictionary tools such as lexicons, word-form dictionaries, or knowledge bases. However, dictionary data is scarce for many languages, especially low-resource ones. In this article, we introduce a new conceptual dictionary for the Slovak language, the first linguistic tool of this kind. Because Slovak has limited linguistic resources and no machine-readable linguistic data source of sufficient size is currently available, many tasks that require automated processing of Slovak text achieve weaker results than in other languages or are almost impossible to solve.

artificial intelligence, natural language, text processing, (17 more...)

arXiv.org Artificial Intelligence

2512.00579

Country: Europe > Austria (0.28)

Genre: Research Report (0.50)

Industry: Education (0.94)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

On the Analogy between Human Brain and LLMs: Spotting Key Neurons in Grammar Perception

Norouzi, Sanaz Saki, Masjedi, Mohammad, Hitzler, Pascal

arXiv.org Artificial Intelligence, Nov-11-2025

Artificial Neural Networks, the building blocks of AI, were inspired by the human brain's network of neurons. Over the years, these networks have evolved to replicate the complex capabilities of the brain, allowing them to handle tasks such as image and language processing. In the realm of Large Language Models, there has been keen interest in making the language learning process more akin to that of humans. Neuroscientific research has shown that different grammatical categories are processed by different neurons in the brain, and we show that LLMs operate in a similar way. Using Llama 3, we identify the most important neurons associated with the prediction of words belonging to different part-of-speech tags. Using this knowledge, we train a classifier on a dataset, which shows that the activation patterns of these key neurons can reliably predict part-of-speech tags on fresh data. The results suggest the presence of a subspace in LLMs focused on capturing part-of-speech tag concepts, resembling patterns observed in lesion studies of the brain in neuroscience.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2511.06519

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

LingoLoop Attack: Trapping MLLMs via Linguistic Context and State Entrapment into Endless Loops

Fu, Jiyuan, Jiang, Kaixun, Hong, Lingyi, Li, Jinglun, Guo, Haijing, Yang, Dingkang, Chen, Zhaoyu, Zhang, Wenqiang

arXiv.org Artificial Intelligence, Jun-18-2025

Multimodal Large Language Models (MLLMs) have shown great promise but require substantial computational resources during inference. Attackers can exploit this by inducing excessive output, leading to resource exhaustion and service degradation. Prior energy-latency attacks aim to increase generation time by broadly shifting the output token distribution away from the EOS token, but they neglect the influence of token-level Part-of-Speech (POS) characteristics on EOS and of sentence-level structural patterns on output counts, which limits their efficacy. To address this, we propose LingoLoop, an attack designed to induce MLLMs to generate excessively verbose and repetitive sequences. First, we find that the POS tag of a token strongly affects the likelihood of generating an EOS token. Based on this insight, we propose a POS-Aware Delay Mechanism to postpone EOS token generation by adjusting attention weights guided by POS information. Second, we identify that constraining output diversity to induce repetitive loops is effective for sustained generation. We introduce a Generative Path Pruning Mechanism that limits the magnitude of hidden states, encouraging the model to produce persistent loops. Extensive experiments demonstrate that LingoLoop can increase generated tokens by up to 30 times and energy consumption by a comparable factor on models like Qwen2.5-VL-3B, consistently driving MLLMs towards their maximum generation limits. These findings expose significant vulnerabilities in MLLMs, posing challenges for their reliable deployment. The code will be released publicly following the paper's acceptance.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2506.14493

Country:

Europe > Austria > Vienna (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)
(9 more...)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.68)

Symbolic Prompt Program Search: A Structure-Aware Approach to Efficient Compile-Time Prompt Optimization

Schnabel, Tobias, Neville, Jennifer

arXiv.org Artificial Intelligence, Jun-27-2024

In many modern LLM applications, such as retrieval augmented generation, prompts have become programs themselves. In these settings, prompt programs are repeatedly called with different user queries or data instances. A big practical challenge is optimizing such prompt programs. Recent work has mostly focused on either simple prompt programs or assumed that the general structure of a prompt program is fixed. We introduce SAMMO, a framework to perform symbolic prompt program search for compile-time optimizations of prompt programs. SAMMO represents prompt programs on a symbolic level, which allows for a rich set of transformations that can be searched over during optimization. We show that SAMMO generalizes previous methods and improves the performance of complex prompts on (1) instruction tuning, (2) RAG pipeline tuning, and (3) prompt compression, across several different LLMs. We make all code available open-source at https://github.com/microsoft/sammo.

instruction, pos tag, prompt program, (16 more...)

arXiv.org Artificial Intelligence

2404.02319

Country:

North America > United States > New York (0.05)
South America > Brazil (0.05)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.40)

Industry: Leisure & Entertainment (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

ToPro: Token-Level Prompt Decomposition for Cross-Lingual Sequence Labeling Tasks

Ma, Bolei, Nie, Ercong, Yuan, Shuzhou, Schmid, Helmut, Färber, Michael, Kreuter, Frauke, Schütze, Hinrich

arXiv.org Artificial Intelligence, Jan-29-2024

Prompt-based methods have been successfully applied to multilingual pretrained language models for zero-shot cross-lingual understanding. However, most previous studies primarily focused on sentence-level classification tasks, and only a few considered token-level labeling tasks such as Named Entity Recognition (NER) and Part-of-Speech (POS) tagging. In this paper, we propose Token-Level Prompt Decomposition (ToPro), which facilitates the prompt-based method for token-level sequence labeling tasks. The ToPro method decomposes an input sentence into single tokens and applies one prompt template to each token. Our experiments on multilingual NER and POS tagging datasets demonstrate that ToPro-based fine-tuning outperforms vanilla fine-tuning and Prompt-Tuning in zero-shot cross-lingual transfer, especially for languages that are typologically different from the source language, English. Our method also attains state-of-the-art performance when employed with the mT5 model. In addition, our exploratory study in multilingual large language models shows that ToPro performs much better than the current in-context learning method. Overall, the performance improvements show that ToPro could potentially serve as a novel and simple benchmarking method for sequence labeling tasks.
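
The decomposition step described in the abstract can be illustrated with a minimal sketch. The template wording, the `<mask>` placeholder, and the function name `decompose` are assumptions made for illustration and are not taken from the paper; whitespace splitting stands in for real tokenization.

```python
from typing import List

# Hypothetical prompt template; the wording used by ToPro may differ.
TEMPLATE = 'Sentence: {sentence} The part-of-speech tag of the word "{token}" is: {mask}'


def decompose(sentence: str, mask: str = "<mask>") -> List[str]:
    """Return one prompt per whitespace-separated token of `sentence`."""
    tokens = sentence.split()
    return [TEMPLATE.format(sentence=sentence, token=tok, mask=mask) for tok in tokens]


prompts = decompose("Dogs bark loudly")
for prompt in prompts:
    print(prompt)
```

Each prompt would then be passed to a language model that fills in the masked label for its token, so a sentence of n tokens yields n independent predictions.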

computational linguistic, linguistic, vanilla, (14 more...)

arXiv.org Artificial Intelligence

2401.16589

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.04)
North America > Dominican Republic (0.04)
(16 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.86)

BanLemma: A Word Formation Dependent Rule and Dictionary Based Bangla Lemmatizer

Afrin, Sadia, Chowdhury, Md. Shahad Mahmud, Islam, Md. Ekramul, Khan, Faisal Ahamed, Chowdhury, Labib Imam, Mahtab, MD. Motahar, Chowdhury, Nazifa Nuha, Forkan, Massud, Kundu, Neelima, Arif, Hakim, Rashid, Mohammad Mamun Or, Amin, Mohammad Ruhul, Mohammed, Nabeel

arXiv.org Artificial Intelligence, Nov-6-2023

Lemmatization holds significance in both natural language processing (NLP) and linguistics, as it effectively decreases data density and aids in comprehending contextual meaning. However, because Bangla is highly inflected and morphologically rich, lemmatizing Bangla text poses a complex challenge. In this study, we propose linguistic rules for lemmatization and utilize a dictionary along with the rules to design a lemmatizer specifically for Bangla. Our system aims to lemmatize words based on their part-of-speech class within a given sentence. Unlike previous rule-based approaches, we analyze suffix marker occurrence according to morpho-syntactic values and then utilize sequences of suffix markers instead of entire suffixes. To develop our rules, we analyze a large corpus of Bangla text from various domains, sources, and time periods to observe the word formation of inflected words. The lemmatizer achieves an accuracy of 96.36% when tested against a test dataset manually annotated by trained linguists and demonstrates competitive performance on three previously published Bangla lemmatization datasets. We are making the code and datasets publicly available at https://github.com/eblict-gigatech/BanLemma in order to contribute to the further advancement of Bangla NLP.
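
The general idea of stripping a sequence of suffix markers, conditioned on part of speech and checked against a dictionary, can be sketched as follows. This is a toy illustration on English data, not the BanLemma rules or dictionary; the names `MARKERS`, `DICTIONARY`, and `lemmatize` are hypothetical.

```python
from typing import Dict, Set

# Toy English data, purely illustrative; not the BanLemma rules or dictionary.
MARKERS: Dict[str, Set[str]] = {
    "NOUN": {"s", "'"},
    "VERB": {"ing", "ed", "s"},
}
DICTIONARY: Set[str] = {"dog", "bus", "walk"}


def lemmatize(word: str, pos: str) -> str:
    """Strip suffix markers one at a time until a dictionary entry is reached."""
    candidate = word
    while candidate not in DICTIONARY:
        # Try longer markers first so that "ing" is preferred over "s".
        for marker in sorted(MARKERS.get(pos, ()), key=len, reverse=True):
            if candidate.endswith(marker) and len(candidate) > len(marker):
                candidate = candidate[: -len(marker)]
                break
        else:
            # No marker applies: fall back to the original surface form.
            return word
    return candidate


print(lemmatize("dogs'", "NOUN"))   # two markers are stripped in sequence
print(lemmatize("walked", "VERB"))
print(lemmatize("bus", "NOUN"))     # already in the dictionary, left unchanged
```

Checking the dictionary before stripping keeps words such as "bus" from being over-stemmed, which is one reason a dictionary is useful alongside the rules.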

dataset, lemma, lemmatizer, (17 more...)

arXiv.org Artificial Intelligence

2311.03078

Country:

Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
Asia > Indonesia > Bali (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
(7 more...)

Genre: Research Report > New Finding (0.66)

Industry: Media > News (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.89)

Another Dead End for Morphological Tags? Perturbed Inputs and Parsing

Muñoz-Ortiz, Alberto, Vilares, David

arXiv.org Artificial Intelligence, May-24-2023

The usefulness of part-of-speech tags for parsing has been heavily questioned due to the success of word-contextualized parsers. Yet most studies are limited to coarse-grained tags and high-quality written content, and we know little about their influence on models in production that face lexical errors. We expand these setups and design an adversarial attack to test whether the use of morphological information by parsers (i) contributes to error propagation or (ii) can instead help correct mistakes that word-only neural parsers make. The results on 14 diverse UD treebanks show that under such attacks, using morphological tags degrades the performance of transition- and graph-based models even faster, while for the (lower-performing) sequence labeling parsers the tags are helpful. We also show that if morphological tags were utopically robust against lexical perturbations, they would be able to correct parsing mistakes.

artificial intelligence, natural language, parser, (17 more...)

arXiv.org Artificial Intelligence

2305.15119

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
(14 more...)

Genre: Research Report (0.50)

Industry: Government (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Parsing linearizations appreciate PoS tags - but some are fussy about errors

Muñoz-Ortiz, Alberto, Anderson, Mark, Vilares, David, Gómez-Rodríguez, Carlos

arXiv.org Artificial Intelligence, Oct-27-2022

PoS tags, once taken for granted as a useful resource for syntactic parsing, have become more situational with the popularization of deep learning. Recent work on the impact of PoS tags on graph- and transition-based parsers suggests that they are only useful when tagging accuracy is prohibitively high, or in low-resource scenarios. However, such an analysis is lacking for the emerging sequence labeling parsing paradigm, where it is especially relevant as some models explicitly use PoS tags for encoding and decoding. We undertake a study and uncover some trends. Among them, PoS tags are generally more useful for sequence labeling parsers than for other paradigms, but the impact of their accuracy is highly encoding-dependent, with the PoS-based head-selection encoding being best only when both tagging accuracy and resource availability are high.

accuracy, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2210.15219

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom > Wales > Cardiff (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Tag-Set-Sequence Learning for Generating Question-Answer Pairs

Zhang, Cheng, Wang, Jie

arXiv.org Artificial Intelligence, Oct-20-2022

Transformer-based question generation (QG) models can generate high-quality question-answer pairs (QAPs), but may also generate silly questions for certain texts. We present a new method called tag-set sequence learning to tackle this problem, where a tag-set sequence is a sequence of tag sets that captures the syntactic and semantic information of the underlying sentence, and a tag set consists of one or more language feature tags, including, for example, semantic-role-labeling, part-of-speech, named-entity-recognition, and sentiment-indication tags. We construct a system called TSS-Learner to learn tag-set sequences from given declarative sentences and the corresponding interrogative sentences, and derive answers to the latter. We train a TSS-Learner model for the English language using a small training dataset and show that it can indeed generate adequate QAPs for certain texts on which transformer-based models do poorly. Human evaluation of the QAPs generated by TSS-Learner over SAT practice reading tests is encouraging.

interrogative sentence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2210.11608

Country:

North America > United States > Massachusetts > Middlesex County > Lowell (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Colorado (0.04)
(7 more...)

Genre: Research Report (0.50)

Industry: Education > Educational Technology > Educational Software > Computer Based Training (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Part of Speech Tagging

#artificialintelligence, Jan-21-2022, 05:07:46 GMT

Part of Speech (POS) is a way to describe the grammatical function of a word. In Natural Language Processing (NLP), POS tagging is an essential building block for language models and for interpreting text. While POS tags feed into higher-level NLP functions, it is important to understand them on their own, and they can be put to use directly in your text analysis. English is commonly described as having eight (sometimes nine) parts of speech. Noun: A noun is the name of a person, place, thing, or idea.
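
As a rough illustration of using POS tags directly in text analysis, the sketch below tags tokens by lexicon lookup and then pulls out the nouns. The `LEXICON` dictionary and the `tag` function are assumptions for illustration only; practical taggers are statistical or neural models provided by NLP libraries, and their tag sets may differ from the labels used here.

```python
from collections import Counter
from typing import List, Tuple

# Tiny hand-written lexicon, assumed for illustration only.
LEXICON = {
    "the": "DET",
    "customer": "NOUN",
    "library": "NOUN",
    "praised": "VERB",
    "new": "ADJ",
}


def tag(sentence: str) -> List[Tuple[str, str]]:
    """Tag each whitespace token by lexicon lookup; unknown words get 'X'."""
    return [(tok, LEXICON.get(tok.lower(), "X")) for tok in sentence.split()]


tagged = tag("The customer praised the new library")
nouns = [tok for tok, pos in tagged if pos == "NOUN"]
counts = Counter(pos for _, pos in tagged)

print(tagged)
print(nouns)
print(counts)
```

Filtering by tag, as done for `nouns`, is a simple way to extract candidate topics or entities from text once tags are available.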

customer, library, nlp library, (13 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.90)
