AITopics | categorisation


categorisation

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Toward Adaptive Categories: Dimensional Governance for Agentic AI

Engin, Zeynep, Hand, David

arXiv.org Artificial Intelligence, Nov-25-2025

As AI systems evolve from static tools to dynamic agents, traditional categorical governance frameworks -- based on fixed risk tiers, levels of autonomy, or human oversight models -- are increasingly insufficient on their own. Systems built on foundation models, self-supervised learning, and multi-agent architectures increasingly blur the boundaries that categories were designed to police. In this Perspective, we make the case for dimensional governance: a framework that tracks how decision authority, process autonomy, and accountability (the 3As) distribute dynamically across human-AI relationships. A critical advantage of this approach is its ability to explicitly monitor system movement toward and across key governance thresholds, enabling preemptive adjustments before risks materialize. This dimensional approach provides the necessary foundation for more adaptive categorization, enabling thresholds and classifications that can evolve with emerging capabilities. While categories remain essential for decision-making, building them upon dimensional foundations allows for context-specific adaptability and stakeholder-responsive governance that static approaches cannot achieve. We outline key dimensions, critical trust thresholds, and practical examples illustrating where rigid categorical frameworks fail -- and where a dimensional mindset could offer a more resilient and future-proof path forward for both governance and innovation at the frontier of artificial intelligence.

artificial intelligence, governance, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2505.11579

Country: Europe > United Kingdom (0.14)

Genre: Research Report (0.50)

Industry:

Law (1.00)
Government (1.00)
Banking & Finance (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)


Beyond the Link: Assessing LLMs' ability to Classify Political Content across Global Media

De La Fuente-Cuesta, Alejandro, Martinez-Serra, Alberto, Visscher, Nienke, Castro, Laia, Cardenal, Ana S.

arXiv.org Artificial Intelligence, Nov-5-2025

The use of large language models (LLMs) is becoming common in political science and digital media research. While LLMs have demonstrated ability in labelling tasks, their effectiveness in classifying Political Content (PC) from URLs remains underexplored. This article evaluates whether LLMs can accurately distinguish PC from non-PC using both the text and the URLs of news articles across five countries (France, Germany, Spain, the UK, and the US) and their different languages. Using cutting-edge models, we benchmark their performance against human-coded data to assess whether URL-level analysis can approximate full-text analysis. Our findings show that URLs embed relevant information and can serve as a scalable, cost-effective alternative to discern PC. However, we also uncover systematic biases: LLMs seem to overclassify centrist news as political, leading to false positives that may distort further analyses. We conclude by outlining methodological recommendations on the use of LLMs in political science research.

classification, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2506.17435

Country:

Europe > Spain (0.35)
Europe > Germany (0.25)
Europe > France (0.25)

Genre: Research Report > New Finding (1.00)

Industry:

Government (0.94)
Media > News (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.95)


FoQA: A Faroese Question-Answering Dataset

Simonsen, Annika, Nielsen, Dan Saattrup, Einarsson, Hafsteinn

arXiv.org Artificial Intelligence, Feb-11-2025

We present FoQA, a Faroese extractive question-answering (QA) dataset with 2,000 samples, created using a semi-automated approach combining Large Language Models (LLMs) and human validation. The dataset was generated from Faroese Wikipedia articles using GPT-4-turbo for initial QA generation, followed by question rephrasing to increase complexity and native speaker validation to ensure quality. We provide baseline performance metrics for FoQA across multiple models, including LLMs and BERT, demonstrating its effectiveness in evaluating Faroese QA performance. The dataset is released in three versions: a validated set of 2,000 samples, a complete set of all 10,001 generated samples, and a set of 2,395 rejected samples for error analysis.

annotator, dataset, faroese, (14 more...)

arXiv.org Artificial Intelligence

2502.07642

Country:

Europe > Iceland (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
Europe > Sweden (0.04)
(7 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Evaluation of Code LLMs on Geospatial Code Generation

Gramacki, Piotr, Martins, Bruno, Szymański, Piotr

arXiv.org Artificial Intelligence, Dec-13-2024

Software development support tools have been studied for a long time, with recent approaches using Large Language Models (LLMs) for code generation. These models can generate Python code for data science and machine learning applications. LLMs are helpful for software engineers because they increase productivity in daily work. An LLM can also serve as a "mentor" for inexperienced software developers, and be a viable learning support. High-quality code generation with LLMs can also be beneficial in geospatial data science. However, this domain poses different challenges, and code generation LLMs are typically not evaluated on geospatial tasks. Here, we show how we constructed an evaluation benchmark for code generation models, based on a selection of geospatial tasks. We categorised geospatial tasks based on their complexity and required tools. Then, we created a dataset with tasks that test model capabilities in spatial reasoning, spatial data processing, and geospatial tool usage. The dataset consists of specific coding problems that were manually created for high quality. For every problem, we proposed a set of test scenarios that make it possible to automatically check the generated code for correctness. In addition, we tested a selection of existing code generation LLMs for code generation in the geospatial domain. We share our dataset and reproducible evaluation code on a public GitHub repository, arguing that this can serve as an evaluation benchmark for new LLMs in the future. Our dataset will hopefully contribute to the development of new models capable of solving geospatial coding tasks with high accuracy. These models will enable the creation of coding assistants tailored for geospatial applications.
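The idea of checking generated code against a set of test scenarios can be sketched roughly as follows. This is a minimal illustration, not the benchmark's actual harness: the function `passes_scenarios`, the task `bbox_contains`, and the scenarios are all hypothetical names chosen for the example, and a real evaluation would run untrusted code in a sandbox rather than with a bare `exec`.

```python
def passes_scenarios(generated_code, func_name, scenarios):
    """Return True if the generated function matches every (args, expected) scenario.

    Hypothetical helper: executes the candidate source in a fresh namespace.
    Any error (syntax error, missing function, runtime failure) counts as a fail.
    """
    namespace = {}
    try:
        exec(generated_code, namespace)  # unsafe for untrusted code; sketch only
        func = namespace[func_name]
        return all(func(*args) == expected for args, expected in scenarios)
    except Exception:
        return False


# A toy "generated" solution to a made-up geospatial task:
# does a bounding box (min_x, min_y, max_x, max_y) contain a point (x, y)?
candidate = """
def bbox_contains(bbox, point):
    min_x, min_y, max_x, max_y = bbox
    x, y = point
    return min_x <= x <= max_x and min_y <= y <= max_y
"""

# Each scenario pairs the call arguments with the expected return value.
scenarios = [
    (((0, 0, 10, 10), (5, 5)), True),
    (((0, 0, 10, 10), (11, 5)), False),
]

result = passes_scenarios(candidate, "bbox_contains", scenarios)
```

Treating every exception as a failure keeps the check fully automatic: a candidate either passes all scenarios or it does not.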

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3687123.3698286

2410.04617

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Georgia > Fulton County > Atlanta (0.05)
Europe > Poland > Lower Silesia Province > Wroclaw (0.04)
(9 more...)

Genre: Research Report > New Finding (0.67)

Industry: Information Technology > Software (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Phonetic Error Analysis of Raw Waveform Acoustic Models with Parametric and Non-Parametric CNNs

Loweimi, Erfan, Carmantini, Andrea, Bell, Peter, Renals, Steve, Cvetkovic, Zoran

arXiv.org Artificial Intelligence, Jun-2-2024

In this paper, we analyse the error patterns of the raw waveform acoustic models in TIMIT's phone recognition task. Our analysis goes beyond the conventional phone error rate (PER) metric. We categorise the phones into three groups: {affricate, diphthong, fricative, nasal, plosive, semi-vowel, vowel, silence}, {consonant, vowel+, silence}, and {voiced, unvoiced, silence}, and compute the PER for each broad phonetic class in each category. We also construct a confusion matrix for each category using the substitution errors and compare the confusion patterns with those of the Filterbank and Wav2vec 2.0 systems. Our raw waveform acoustic models consist of parametric (Sinc2Net) or non-parametric CNNs and Bidirectional LSTMs, achieving down to 13.7%/15.2% PERs on TIMIT Dev/Test sets, outperforming reported PERs for raw waveform models in the literature. We also investigate the impact of transfer learning from WSJ on the phonetic error patterns and confusion matrices. It reduces the PER to 11.8%/13.7% on the Dev/Test sets.
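As a rough illustration of building a confusion matrix over broad phonetic classes from substitution errors, consider the sketch below. The phone-to-class mapping is a small, partial table made up for the example (it is not the paper's mapping), and the substitution pairs are toy data; in practice they would come from aligning reference and hypothesis phone sequences.

```python
from collections import Counter

# Illustrative, partial phone-to-class mapping (an assumption for this sketch).
BROAD_CLASS = {
    "p": "plosive", "b": "plosive", "t": "plosive",
    "s": "fricative", "z": "fricative", "f": "fricative",
    "m": "nasal", "n": "nasal",
    "iy": "vowel", "ih": "vowel", "aa": "vowel",
}


def class_confusions(substitutions):
    """Count (reference class, hypothesis class) pairs over substitution errors."""
    counts = Counter()
    for ref_phone, hyp_phone in substitutions:
        counts[(BROAD_CLASS[ref_phone], BROAD_CLASS[hyp_phone])] += 1
    return counts


# Toy substitution errors as (reference phone, recognised phone) pairs.
subs = [("p", "b"), ("s", "z"), ("m", "n"), ("iy", "ih"), ("t", "s")]
confusions = class_confusions(subs)

# Off-diagonal cells are confusions between classes; diagonal cells are
# substitutions that stay within the same broad class.
cross_class = {pair: n for pair, n in confusions.items() if pair[0] != pair[1]}
```

Here only the ("plosive", "fricative") cell is off the diagonal, since "t" was recognised as "s"; the other four substitutions stay within their class.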

categorisation, confusion matrix, raw waveform model, (11 more...)

arXiv.org Artificial Intelligence

2406.00898

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Classifying COVID-19 vaccine narratives

Li, Yue, Scarton, Carolina, Song, Xingyi, Bontcheva, Kalina

arXiv.org Artificial Intelligence, Nov-17-2023

Vaccine hesitancy is widespread, despite the government's information campaigns and the efforts of the World Health Organisation (WHO). Categorising the topics within vaccine-related narratives is crucial to understand the concerns expressed in discussions and identify the specific issues that contribute to vaccine hesitancy. This paper addresses the need for monitoring and analysing vaccine narratives online by introducing a novel vaccine narrative classification task, which categorises COVID-19 vaccine claims into one of seven categories. Following a data augmentation approach, we first construct a novel dataset for this new classification task, focusing on the minority classes. We also make use of fact-checker annotated data. The paper also presents a neural vaccine narrative classifier that achieves an accuracy of 84% under cross-validation. The classifier is publicly available for researchers and journalists.

dataset, narrative, vaccine, (17 more...)

arXiv.org Artificial Intelligence

2207.08522

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China (0.05)
Oceania > Australia > Victoria > Melbourne (0.04)
(6 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine > Therapeutic Area > Vaccines (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)


AI-Based Facial Emotion Recognition Solutions for Education: A Study of Teacher-User and Other Categories

Ravenor, R. Yamamoto

arXiv.org Artificial Intelligence, Aug-29-2023

Existing information on AI-based facial emotion recognition (FER) is not easily comprehensible by those outside the field of computer science, requiring cross-disciplinary effort to determine a categorisation framework that promotes the understanding of this technology, and its impact on users. Most proponents classify FER in terms of methodology, implementation and analysis; relatively few by its application in education; and none by its users. This paper is concerned primarily with (potential) teacher-users of FER tools for education. It proposes a three-part classification of these teachers, by orientation, condition and preference, based on a classical taxonomy of affective educational objectives, and related theories. It also compiles and organises the types of FER solutions found in or inferred from the literature into "technology" and "applications" categories, as a prerequisite for structuring the proposed "teacher-user" category. This work has implications for proponents', critics', and users' understanding of the relationship between teachers and FER.

artificial intelligence, machine learning, student, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.54364/AAIML.2024.42122

2308.15119

Country:

Europe (0.93)
Asia > Japan (0.28)
North America > United States (0.28)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Education > Educational Setting (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.94)
Education > Curriculum > Subject-Specific Education (0.93)
Education > Educational Technology > Educational Software > Computer Based Training (0.46)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)


Categorising Products in an Online Marketplace: An Ensemble Approach

Drumm, Kieron

arXiv.org Artificial Intelligence, Apr-26-2023

In recent years, product categorisation has been a common issue for E-commerce companies who have utilised machine learning to categorise their products automatically. In this study, we propose an ensemble approach, using a combination of different models to separately predict each product's category, subcategory, and colour before ultimately combining the resultant predictions for each product. With the aforementioned approach, we show that an average F1-score of 0.82 can be achieved using a combination of XGBoost and k-nearest neighbours to predict said features.
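The abstract reports an average F1-score over the predicted attributes without stating the averaging scheme. The sketch below assumes one plausible reading: a macro-averaged F1 per attribute (category, subcategory, colour), followed by a plain mean across the three attributes. The labels are toy data, and `macro_f1` and `average_f1` are illustrative helpers, not the paper's code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 scores."""
    labels = set(y_true) | set(y_pred)
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the label never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def average_f1(per_attribute):
    """Mean of macro-F1 across attributes (assumed averaging scheme)."""
    scores = [macro_f1(true, pred) for true, pred in per_attribute.values()]
    return sum(scores) / len(scores)


# Toy (true labels, predicted labels) for each separately predicted attribute.
toy = {
    "category": (["shoes", "shirt", "shoes", "hat"], ["shoes", "shirt", "hat", "hat"]),
    "subcategory": (["boots", "tee", "boots", "cap"], ["boots", "tee", "boots", "cap"]),
    "colour": (["red", "blue", "red", "red"], ["red", "blue", "blue", "red"]),
}

overall = average_f1(toy)
```

Scoring each attribute separately mirrors the ensemble's structure, in which each attribute is predicted on its own before the predictions are combined per product.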

artificial intelligence, category, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2304.13852

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Europe > Spain (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)

Genre: Research Report > New Finding (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.40)


Searching for Discriminative Words in Multidimensional Continuous Feature Space

Sajgalik, Marius, Barla, Michal, Bielikova, Maria

arXiv.org Artificial Intelligence, Nov-26-2022

Word feature vectors have been proven to improve many NLP tasks. With recent advances in unsupervised learning of these feature vectors, it became possible to train them on much more data, which also resulted in better quality of the learned features. Since this learning captures the joint probability of latent features of words, it has the advantage that we can train the vectors without any prior knowledge about the goal task we want to solve. We aim to evaluate the universal applicability property of feature vectors, which has already been proven to hold for many standard NLP tasks like part-of-speech tagging or syntactic parsing. In our case, we want to understand the topical focus of text documents and design an efficient representation suitable for discriminating different topics. The discriminativeness can be evaluated adequately on a text categorisation task. We propose a novel method to extract discriminative keywords from documents. We utilise word feature vectors to understand the relations between words better and also understand the latent topics which are discussed in the text and not mentioned directly but inferred logically. We also present a simple way to calculate document feature vectors out of extracted discriminative words. We evaluate our method on the four most popular datasets for text categorisation. We show how different discriminative metrics influence the overall results. We demonstrate the effectiveness of our approach by achieving state-of-the-art results on a text categorisation task using just a small number of extracted keywords. We prove that word feature vectors can substantially improve the topical inference of documents' meaning. We conclude that distributed representation of words can be used to build higher levels of abstraction as we demonstrate and build feature vectors of documents.

artificial intelligence, keyword, machine learning, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.csl.2017.10.002

2211.14631

Country:

Oceania > Australia > Australian Capital Territory > Canberra (0.04)
Oceania > New Zealand > North Island > Waikato > Hamilton (0.04)
Oceania > Australia > Queensland (0.04)
(17 more...)

Genre:

Research Report > Experimental Study (0.46)
Research Report > Promising Solution (0.34)

Industry:

Leisure & Entertainment (1.00)
Transportation (0.67)
Banking & Finance (0.67)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)


Some Languages are More Equal than Others: Probing Deeper into the Linguistic Disparity in the NLP World

Ranathunga, Surangika, de Silva, Nisansa

arXiv.org Artificial Intelligence, Oct-19-2022

Linguistic disparity in the NLP world is a problem that has been widely acknowledged recently. However, the different facets of this problem and the reasons behind the disparity are seldom discussed within the NLP community. This paper provides a comprehensive analysis of the disparity that exists within the languages of the world. We show that simply categorising languages considering data availability may not always be correct. Using an existing language categorisation based on speaker population and vitality, we analyse the distribution of language data resources, amount of NLP/CL research, inclusion in multilingual web-based platforms and the inclusion in pre-trained multilingual models. We show that many languages do not get covered in these resources or platforms, and even within the languages belonging to the same language group, there is wide disparity. We analyse the impact of family, geographical location, GDP and the speaker population of languages and provide possible reasons for this disparity, along with some suggestions to overcome it.

data mining, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2210.08523

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Asia > Sri Lanka (0.04)
(43 more...)

Genre: Research Report > Experimental Study (0.46)

Industry: Education (0.93)

Technology:

Information Technology > Communications > Web (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Information Management (0.94)
(6 more...)
