
77b5aaf2826c95c98e5eb4ab830073de-Supplemental-Conference.pdf

Neural Information Processing Systems

A system of regions (also referred to as a network) can comprise multiple disjoint regions that exhibit shared activity patterns across a range of tasks. The auditory system is located in the superior temporal region of the brain. This region uniquely encodes pitch, speech, and music, but is not involved in high-level language comprehension and production [Norman-Haignere et al., 2015, 2019]. In our experiments pertaining to programming language comprehension, we use the activity seen in the auditory system as a negative control. For the Python program comprehension experiment, individual programs were modeled using the period from the onset of the code/sentence problem until the button press. See Fedorenko et al. [2010] for a discussion of the functional localization approach as it pertains to the language network.


A Brain regions

Neural Information Processing Systems

A system of regions (also referred to as a network) can comprise multiple disjoint regions that exhibit shared activity patterns across a range of tasks. The auditory system is located in the superior temporal region of the brain. The voxels were then filtered using gray-matter masking and (for the MD and Language systems) network localization. See Fedorenko et al. [2010] for a discussion of the functional localization approach as it pertains to the language network. For each brain system and each code property or code model, we run a separate MVPA analysis.
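The MVPA analyses mentioned above can be illustrated with a minimal sketch: a decoder is trained on voxel activity patterns for each condition and asked to assign a condition to a new pattern. The toy data and the nearest-centroid decoder below are illustrative assumptions, not the authors' actual pipeline or data.

```python
# Minimal sketch of multi-voxel pattern analysis (MVPA): decode a condition
# (e.g. "code" vs "sentence") from a voxel activity pattern by comparing it
# to the mean training pattern of each condition. Toy data, illustrative only.

def centroid(patterns):
    """Element-wise mean of a list of equal-length voxel patterns."""
    n = len(patterns)
    return [sum(vals) / n for vals in zip(*patterns)]

def decode(pattern, centroids):
    """Assign the condition whose centroid is nearest in Euclidean distance."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(centroids, key=lambda label: dist(pattern, centroids[label]))

# Hypothetical training patterns: 4 voxels per pattern, 2 runs per condition.
train = {
    "code":     [[1.0, 0.2, 0.1, 0.9], [0.9, 0.3, 0.2, 1.1]],
    "sentence": [[0.1, 1.0, 0.9, 0.2], [0.2, 0.8, 1.1, 0.1]],
}
centroids = {label: centroid(ps) for label, ps in train.items()}

print(decode([0.95, 0.25, 0.15, 1.0], centroids))  # a code-like test pattern
```

Real analyses would use cross-validated classifiers over many voxels and runs; the centroid decoder only conveys the shape of the computation.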


PromotionGo at SemEval-2025 Task 11: A Feature-Centric Framework for Cross-Lingual Multi-Emotion Detection in Short Texts

Huang, Ziyi, Cui, Xia

arXiv.org Artificial Intelligence

This paper presents our system for SemEval 2025 Task 11: Bridging the Gap in Text-Based Emotion Detection (Track A), which focuses on multi-label emotion detection in short texts. We propose a feature-centric framework that dynamically adapts document representations and learning algorithms to optimize language-specific performance. Our study evaluates three key components: document representation, dimensionality reduction, and model training in 28 languages, highlighting five for detailed analysis. The results show that TF-IDF remains highly effective for low-resource languages, while contextual embeddings like FastText and transformer-based document representations, such as those produced by Sentence-BERT, exhibit language-specific strengths. Principal Component Analysis (PCA) reduces training time without compromising performance, particularly benefiting FastText and neural models such as Multi-Layer Perceptrons (MLP). Computational efficiency analysis underscores the trade-off between model complexity and processing cost. Our framework provides a scalable solution for multilingual emotion detection, addressing the challenges of linguistic diversity and resource constraints.
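As a concrete reference point for the finding above, TF-IDF weighting can be computed from scratch in a few lines. A production system would use a library vectorizer; the toy corpus and tokenization here are purely illustrative.

```python
# TF-IDF from first principles: term frequency within a document, scaled by
# the log-inverse document frequency of the term across the corpus.
import math
from collections import Counter

def tfidf(docs):
    """docs: list of tokenized documents. Returns one {term: weight} dict each."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))  # document frequency
    out = []
    for doc in docs:
        tf = Counter(doc)
        out.append({t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()})
    return out

# Toy "short texts" already tokenized.
docs = [["happy", "joy"], ["sad", "fear"], ["happy", "fear"]]
vecs = tfidf(docs)
```

Terms that appear in fewer documents ("joy", "sad") receive higher weights than common ones ("happy", "fear"), which is exactly the discriminative signal that makes TF-IDF competitive when training data are scarce.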


AIGCodeSet: A New Annotated Dataset for AI Generated Code Detection

Demirok, Basak, Kutlu, Mucahid

arXiv.org Artificial Intelligence

With their rapid advancement, large language models (LLMs) have become useful in a wide range of fields. While these AI systems can be used for code generation, significantly simplifying and accelerating developers' work, their use by students to complete assignments has raised ethical concerns in education. In this context, determining the author of a particular piece of code becomes important. In this study, we introduce AIGCodeSet, a dataset for AI-generated code detection tasks, specifically for the Python programming language. We obtain the problem descriptions and human-written codes from the CodeNet dataset. Using the problem descriptions, we generate AI-written codes with the CodeLlama 34B, Codestral 22B, and Gemini 1.5 Flash models in three approaches: i) generating code from the problem description alone, ii) generating code using the description along with human-written source code containing runtime errors, and iii) generating code using the problem description and human-written code that resulted in wrong answers. Lastly, we conduct a post-processing step to eliminate LLM output irrelevant to the code snippets. Overall, AIGCodeSet consists of 2,828 AI-generated and 4,755 human-written code snippets. We share our code with the research community to support studies on this important topic and provide performance results for baseline AI-generated code detection methods.
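A typical baseline for this kind of detection task starts from stylometric features of the source code. The feature set below (comment ratio, mean line length, blank-line ratio) is a hypothetical illustration of that idea, not the features or baselines reported in the paper.

```python
# Illustrative stylometric features for human- vs AI-written code detection.
# A classifier would be trained on vectors like these; here we only extract them.

def code_features(source: str) -> dict:
    lines = source.splitlines()
    nonblank = [l for l in lines if l.strip()]
    comments = [l for l in nonblank if l.lstrip().startswith("#")]
    return {
        "comment_ratio": len(comments) / max(len(nonblank), 1),
        "mean_line_len": sum(map(len, nonblank)) / max(len(nonblank), 1),
        "blank_ratio": (len(lines) - len(nonblank)) / max(len(lines), 1),
    }

snippet = "# read input\nn = int(input())\n\nprint(n * 2)\n"
feats = code_features(snippet)
```

Whether such surface features separate the two classes well is an empirical question the dataset is designed to answer; stronger baselines would use token-level language models.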


MELO: An Evaluation Benchmark for Multilingual Entity Linking of Occupations

Retyk, Federico, Gasco, Luis, Carrino, Casimiro Pio, Deniz, Daniel, Zbib, Rabih

arXiv.org Artificial Intelligence

We present the Multilingual Entity Linking of Occupations (MELO) Benchmark, a new collection of 48 datasets for evaluating the linking of entity mentions in 21 languages to the ESCO Occupations multilingual taxonomy. MELO was built using high-quality, pre-existing human annotations. We conduct experiments with simple lexical models and general-purpose sentence encoders, evaluated as bi-encoders in a zero-shot setup, to establish baselines for future research.
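The zero-shot bi-encoder setup can be sketched as follows: the mention and every taxonomy entry are encoded independently, and the mention is linked to the entry with the highest cosine similarity. A toy bag-of-words "encoder" and a three-entry mini-taxonomy stand in for a real sentence encoder and the full ESCO taxonomy.

```python
# Bi-encoder entity linking in miniature: encode both sides separately,
# then rank taxonomy entries by cosine similarity to the mention.
import math
from collections import Counter

def encode(text):
    """Toy encoder: bag-of-words counts (a sentence encoder in practice)."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def link(mention, taxonomy):
    m = encode(mention)
    return max(taxonomy, key=lambda entry: cosine(m, encode(entry)))

# Hypothetical ESCO-style occupation labels.
taxonomy = ["software developer", "data scientist", "nurse"]
print(link("senior python developer", taxonomy))
```

Because neither side needs to see the other at encoding time, taxonomy embeddings can be precomputed once, which is what makes the bi-encoder setup practical for large taxonomies.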


MasonTigers at SemEval-2024 Task 1: An Ensemble Approach for Semantic Textual Relatedness

Goswami, Dhiman, Puspo, Sadiya Sayara Chowdhury, Raihan, Md Nishat, Emran, Al Nahian Bin, Ganguly, Amrita, Zampieri, Marcos

arXiv.org Artificial Intelligence

This paper presents the MasonTigers entry to SemEval-2024 Task 1 - Semantic Textual Relatedness. The task encompasses supervised (Track A), unsupervised (Track B), and cross-lingual (Track C) approaches across 14 different languages. MasonTigers stands out as one of only two teams that participated in all languages across the three tracks. Our approaches achieved rankings ranging from 11th to 21st in Track A, from 1st to 8th in Track B, and from 5th to 12th in Track C. Adhering to the task-specific constraints, our best performing approaches utilize an ensemble of statistical machine learning approaches combined with language-specific BERT-based models and sentence transformers.
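The ensembling idea can be sketched as a simple per-pair score average over the component models. The mean combination rule below is an illustrative assumption; the paper's exact weighting scheme may differ.

```python
# Mean ensemble over relatedness scores: each model scores every sentence
# pair, and the ensemble prediction is the per-pair average.

def mean_ensemble(score_lists):
    """score_lists: one list of relatedness scores per model, aligned by pair."""
    return [sum(scores) / len(scores) for scores in zip(*score_lists)]

model_a = [0.9, 0.1, 0.5]   # hypothetical per-pair scores from one model
model_b = [0.7, 0.3, 0.5]   # ... and from another
combined = mean_ensemble([model_a, model_b])  # ≈ [0.8, 0.2, 0.5]
```

Averaging tends to cancel the uncorrelated errors of individual models, which is why even this simple rule is often competitive for graded-similarity tasks.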


Evaluating Embeddings for One-Shot Classification of Doctor-AI Consultations

Ojo, Olumide Ebenezer, Adebanji, Olaronke Oluwayemisi, Gelbukh, Alexander, Calvo, Hiram, Feldman, Anna

arXiv.org Artificial Intelligence

Effective communication between healthcare providers and patients is crucial to providing high-quality patient care. In this work, we investigate how doctor-written and AI-generated texts in healthcare consultations can be classified using state-of-the-art embeddings and one-shot classification systems. By analyzing embeddings such as bag-of-words, character n-grams, Word2Vec, GloVe, fastText, and GPT2 embeddings, we examine how well our one-shot classification systems capture semantic information within medical consultations. Results show that the embeddings capture semantic features from text in a reliable and adaptable manner. Overall, the Word2Vec, GloVe, and character n-gram embeddings performed well, indicating their suitability for this task. GPT2 embeddings also showed notable performance, indicating their suitability for models tailored to this task as well. Our machine learning architectures perform well even when training data are scarce, supporting better communication between patients and healthcare providers.
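One-shot classification in this setting reduces to a similarity search: with a single labeled example per class, a new text is assigned to the class whose example it most resembles. The sketch below uses character n-gram counts (one of the representations studied) with a toy overlap similarity; the example texts are invented, not from the paper's data.

```python
# One-shot classification with character n-gram representations: one labeled
# example per class, nearest example wins.
from collections import Counter

def char_ngrams(text, n=3):
    t = text.lower()
    return Counter(t[i:i + n] for i in range(len(t) - n + 1))

def similarity(a, b):
    """Overlap of n-gram counts (a stand-in for cosine similarity)."""
    return sum(min(a[g], b[g]) for g in a)

def one_shot(text, examples):
    v = char_ngrams(text)
    return max(examples, key=lambda label: similarity(v, char_ngrams(examples[label])))

examples = {  # hypothetical single labeled example per class
    "doctor": "Please take the prescribed medication twice daily.",
    "ai":     "As an AI language model, I recommend consulting a physician.",
}
print(one_shot("As an AI, I cannot provide a diagnosis.", examples))
```

Character n-grams make the comparison robust to small spelling and inflection differences, one reason they performed well alongside word-level embeddings.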


Disaster Tweets Classification using BERT-Based Language Model

Le, Anh Duc

arXiv.org Artificial Intelligence

Social networking services have become an important communication channel in times of emergency. The aim of this study is to create a machine learning language model that can determine whether a person or area is in danger. The ubiquity of smartphones enables people to announce an emergency they are observing in real time. Because of this, more agencies, such as disaster relief organizations and news agencies, are interested in programmatically monitoring Twitter. Designing a language model that can understand and acknowledge when a disaster is happening based on social network posts will become increasingly necessary over time.


Hybrid consistency and plausibility verification of product data according to FIC

Schorr, Christian

arXiv.org Artificial Intelligence

The labelling of food products in the EU is regulated by the Food Information to Consumers (FIC) regulation. Companies are required to provide the corresponding information regarding nutrients and allergens, among others. With the rise of e-commerce, more and more food products are sold online. Online product descriptions often contain errors in the FIC-relevant information due to low data quality in the vendors' product databases. In this paper, we propose a hybrid approach combining rule-based methods and machine learning to verify nutrient declarations and allergen labelling according to FIC requirements. Special focus is given to the problem of false negatives in allergen prediction, since these pose a significant health risk to customers. Results show that a neural net trained on a subset of a product's ingredients can predict the allergens it contains with high reliability.
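The hybrid idea can be sketched as a rule-based keyword check running alongside a statistical model, with the model's decision threshold set low so that allergen false negatives stay rare. The keyword lexicon, scores, and threshold below are illustrative assumptions, not the paper's actual rules or network.

```python
# Hybrid allergen verification sketch: union of rule hits and (stubbed)
# model predictions, with a deliberately low threshold to favor recall.

ALLERGEN_KEYWORDS = {  # hypothetical mini-lexicon
    "gluten": ["wheat", "barley", "rye"],
    "milk": ["milk", "butter", "whey"],
}

def rule_flags(ingredients):
    """Allergens whose keywords appear anywhere in the ingredient list."""
    text = " ".join(ingredients).lower()
    return {a for a, kws in ALLERGEN_KEYWORDS.items() if any(k in text for k in kws)}

def verify(ingredients, model_scores, threshold=0.2):
    """Union of rule hits and model hits; a low threshold reduces false negatives."""
    model_hits = {a for a, s in model_scores.items() if s >= threshold}
    return rule_flags(ingredients) | model_hits

declared = {"gluten"}  # what the online product description claims
predicted = verify(["wheat flour", "whey powder"], {"milk": 0.3, "gluten": 0.9})
missing = predicted - declared  # allergens likely missing from the label
```

Taking the union of the two detectors trades precision for recall, which matches the paper's emphasis that a missed allergen is far costlier than a spurious warning.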


How to detect novelty in textual data streams? A comparative study of existing methods

Christophe, Clément, Velcin, Julien, Cugliari, Jairo, Suignard, Philippe, Boumghar, Manel

arXiv.org Machine Learning

Since datasets annotated for novelty at the document and/or word level are not easily available, we present a simulation framework that allows us to create different textual datasets in which we control how novelty occurs. We also present a benchmark of existing methods for novelty detection in textual data streams. We define a few tasks to solve and compare several state-of-the-art methods. The simulation framework allows us to evaluate their performance on a limited set of scenarios and test their sensitivity to some parameters.
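One simple baseline family for stream novelty detection flags a document when its vocabulary overlaps too little with everything the stream has shown so far. The Jaccard measure, whitespace tokenization, and threshold below are toy choices for illustration, not a method benchmarked in the paper.

```python
# Vocabulary-overlap novelty detection over a text stream: a document is
# novel if its token set barely overlaps the accumulated seen vocabulary.

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def stream_novelty(docs, threshold=0.1):
    """Yield (doc, is_novel) pairs; the seen vocabulary grows as the stream runs."""
    seen = set()
    for doc in docs:
        tokens = set(doc.lower().split())
        yield doc, bool(seen) and jaccard(tokens, seen) < threshold
        seen |= tokens

stream = [
    "power grid maintenance report",
    "grid maintenance schedule update",
    "quarterly earnings beat analyst forecasts",  # topically novel
]
flags = [novel for _, novel in stream_novelty(stream)]
```

A simulation framework like the one described makes exactly this kind of baseline testable: since the injected novelty points are known, the flags can be scored against ground truth.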