main idea
Text Annotation via Inductive Coding: Comparing Human Experts to LLMs in Qualitative Data Analysis
Parfenova, Angelina, Marfurt, Andreas, Denzler, Alexander, Pfeffer, Juergen
This paper investigates the automation of qualitative data analysis, focusing on inductive coding with large language models (LLMs). Unlike traditional approaches that rely on deductive methods with predefined labels, this research examines the inductive process, where labels emerge from the data. The study evaluates the performance of six open-source LLMs against human experts. As part of the evaluation, experts rated the perceived difficulty of the quotes they coded. The results reveal a peculiar dichotomy: human coders consistently perform well when labeling complex sentences but struggle with simpler ones, while LLMs exhibit the opposite trend. Additionally, the study explores systematic deviations in both human- and LLM-generated labels by comparing them to the gold standard from the test set. While human annotations sometimes differ from the gold standard, they are often rated more favorably by other humans. In contrast, some LLMs align more closely with the true labels but receive lower ratings from experts.
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- (3 more...)
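The abstract above compares human- and LLM-generated codes against gold-standard labels without naming a metric. As a minimal sketch of how such alignment could be scored, assuming a sentence-embedding model; the model name, example labels, and pairwise setup are illustrative, not taken from the paper:

```python
# Sketch: scoring how closely generated codes align with gold-standard labels.
# Cosine similarity between sentence embeddings is one plausible choice; the
# paper's actual metric is not specified in the abstract.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def alignment_scores(generated_labels, gold_labels):
    """Cosine similarity between each generated code and its gold label."""
    gen = model.encode(generated_labels, convert_to_tensor=True)
    gold = model.encode(gold_labels, convert_to_tensor=True)
    # Diagonal entries: score for the i-th (generated, gold) pair.
    return util.cos_sim(gen, gold).diagonal()

# Hypothetical example codes, for illustration only.
human = ["work-life balance concerns", "distrust of management"]
gold = ["work-life balance", "lack of trust in leadership"]
print(alignment_scores(human, gold))
```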
Savaal: Scalable Concept-Driven Question Generation to Enhance Human Learning
Noorbakhsh, Kimia, Chandler, Joseph, Karimi, Pantea, Alizadeh, Mohammad, Balakrishnan, Hari
Assessing and enhancing human learning through question-answering is vital, yet automating this process remains challenging. While large language models (LLMs) excel at summarization and query responses, their ability to generate meaningful questions for learners is underexplored. We propose Savaal, a scalable question-generation system with three objectives: (i) scalability, enabling question generation from hundreds of pages of text; (ii) depth of understanding, producing questions that go beyond factual recall to test conceptual reasoning; and (iii) domain independence, automatically generating questions across diverse knowledge areas. Instead of providing an LLM with large documents as context, Savaal improves results with a three-stage processing pipeline. Our evaluation with 76 human experts on 71 papers and PhD dissertations shows that Savaal generates questions that better test depth of understanding by 6.5X for dissertations and 1.5X for papers compared to a direct-prompting LLM baseline. Notably, as document length increases, Savaal's advantages in higher question quality and lower cost become more pronounced.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- (14 more...)
- Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
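The Savaal abstract names a three-stage pipeline but not its mechanics. A plausible sketch, assuming a generic `llm(prompt) -> str` callable; the chunk size, prompts, and stage boundaries here are illustrative assumptions rather than the paper's design:

```python
# Sketch of a three-stage, concept-driven question-generation pipeline in the
# spirit of Savaal: extract concepts per chunk, rank them document-wide, then
# generate conceptual questions. All prompts and parameters are assumptions.
def generate_questions(document: str, llm, chunk_chars: int = 8000, top_k: int = 10):
    chunks = [document[i:i + chunk_chars] for i in range(0, len(document), chunk_chars)]

    # Stage 1: extract candidate concepts per chunk (keeps context windows small).
    concepts = []
    for chunk in chunks:
        out = llm(f"List the key concepts discussed in this passage:\n{chunk}")
        concepts.extend(line.strip("- ") for line in out.splitlines() if line.strip())

    # Stage 2: rank and deduplicate concepts across the whole document.
    ranked = llm(
        "Merge duplicates and rank these concepts by importance, one per line:\n"
        + "\n".join(concepts)
    ).splitlines()[:top_k]

    # Stage 3: one conceptual (non-recall) question per top concept.
    return [
        llm(f"Write one question testing conceptual understanding of: {c}")
        for c in ranked
    ]
```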
Reviews: Adaptive GNN for Image Analysis and Editing
They introduce an adaptive GNN formulated as a label propagation system, which can be related to two computer vision operations: filtering and propagation. The adaptive GNN is built from a guided map, a graph Laplacian, and node weights. The guided map and node weights correspond to the filtering and propagation-diffusion tasks in computer vision, while the kernel of the graph Laplacian determines the diffusion pattern. They apply the model to quotient image analysis (QIA) and design various illumination editing tasks for faces and scenes.
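For context on the formulation the review refers to: label propagation with per-node weights and a graph Laplacian has a standard closed form, since minimizing sum_i w_i (f_i - y_i)^2 + lambda * f^T L f gives f = (W + lambda*L)^{-1} W y. A minimal numpy sketch with illustrative edge and node weights (the paper's exact construction is not given in the review):

```python
# Minimal label-propagation-as-diffusion sketch. The affinity matrix plays the
# role of a guided map (edge weights), node weights set where the input signal
# is trusted, and the Laplacian's kernel shapes the diffusion pattern.
import numpy as np

def propagate(y, A, node_w, lam=1.0):
    """Solve min_f sum_i w_i (f_i - y_i)^2 + lam * f^T L f."""
    L = np.diag(A.sum(axis=1)) - A   # graph Laplacian from the affinity matrix
    W = np.diag(node_w)              # per-node fidelity weights
    return np.linalg.solve(W + lam * L, W @ y)

# Toy chain graph: strong smoothing pulls node values toward their neighbors.
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)
print(propagate(np.array([1.0, 0.0, 0.0]), A, node_w=np.ones(3), lam=2.0))
```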
AdEval: Alignment-based Dynamic Evaluation to Mitigate Data Contamination in Large Language Models
As Large Language Models (LLMs) are pretrained on massive-scale corpora, the issue of data contamination has become increasingly severe, leading to potential overestimation of model performance during evaluation. To address this, we propose AdEval (Alignment-based Dynamic Evaluation), a dynamic evaluation method aimed at mitigating the impact of data contamination on evaluation reliability. AdEval extracts key knowledge points and main ideas to align dynamically generated questions with the core concepts of the static data. It also leverages online search to provide detailed explanations of related knowledge points, thereby creating high-quality evaluation samples with robust knowledge support. Furthermore, AdEval controls the number and complexity of generated questions, ensuring they match the complexity of the static data while still covering varied complexity levels. Based on Bloom's taxonomy, AdEval conducts a multi-dimensional evaluation of LLMs across six cognitive levels: remembering, understanding, applying, analyzing, evaluating, and creating. Experimental results on multiple datasets demonstrate that AdEval effectively reduces the impact of data contamination on evaluation outcomes, enhancing both the fairness and reliability of the evaluation process.
- Europe > Ukraine > Kyiv Oblast > Kyiv (0.04)
- Asia > China > Guangdong Province > Guangzhou (0.04)
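The AdEval abstract outlines its steps without prompts or interfaces. A hedged sketch of the generation loop, assuming a generic `llm(prompt) -> str` callable; the prompt wording and per-level counts are illustrative, not from the paper:

```python
# Sketch of AdEval-style dynamic question generation: extract knowledge points
# from a static sample, then generate aligned questions at each of Bloom's six
# cognitive levels. Online-search augmentation is omitted here.
BLOOM_LEVELS = ["remembering", "understanding", "applying",
                "analyzing", "evaluating", "creating"]

def dynamic_eval_set(static_sample: str, llm, per_level: int = 2):
    # Align generated questions with the static data's core concepts.
    points = llm(f"Extract the key knowledge points and main ideas:\n{static_sample}")
    questions = {}
    for level in BLOOM_LEVELS:
        questions[level] = [
            llm(f"Using these knowledge points:\n{points}\n"
                f"Write a question at Bloom's '{level}' level, "
                f"matched in complexity to the source.")
            for _ in range(per_level)
        ]
    return questions
```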
Rationale Behind Essay Scores: Enhancing S-LLM's Multi-Trait Essay Scoring with Rationale Generated by LLMs
Chu, SeongYeub, Kim, JongWoo, Wong, Bryan, Yi, MunYong
Existing automated essay scoring (AES) methods have relied solely on essay text, without using explanatory rationales for the scores, thereby forgoing an opportunity to capture the specific aspects evaluated by rubric indicators in a fine-grained manner. This paper introduces Rationale-based Multiple Trait Scoring (RMTS), a novel approach to multi-trait essay scoring that integrates prompt-engineering-based large language models (LLMs) with a fine-tuned essay scoring model built on a smaller large language model (S-LLM). RMTS uses an LLM-based trait-wise rationale generation system in which a separate LLM agent generates trait-specific rationales based on rubric guidelines, which the scoring model uses to accurately predict multi-trait scores. Extensive experiments on benchmark datasets, including ASAP, ASAP++, and Feedback Prize, show that RMTS significantly outperforms state-of-the-art models and vanilla S-LLMs in trait-specific scoring. By supporting quantitative assessment with fine-grained qualitative rationales, RMTS enhances trait-wise reliability and provides partial explanations of the essays.
- Education > Educational Setting (1.00)
- Education > Assessment & Standards > Student Performance (1.00)
- Education > Educational Technology > Educational Software > Computer Based Training (0.34)
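The RMTS abstract describes the division of labor between the rationale-generating LLM and the S-LLM scorer, but not their interfaces. A sketch under assumed interfaces; the `llm` callable, the `scorer.predict` method, and the prompt are all hypothetical:

```python
# Sketch of RMTS-style trait scoring: an LLM agent writes a rubric-grounded
# rationale per trait, and a smaller fine-tuned scorer consumes the rationale
# alongside the essay to predict that trait's score.
def score_traits(essay: str, rubric: dict, llm, scorer):
    scores = {}
    for trait, guidelines in rubric.items():
        # Trait-wise rationale generation, grounded in the rubric guidelines.
        rationale = llm(
            f"Rubric for '{trait}': {guidelines}\n"
            f"Essay: {essay}\n"
            f"Explain how well the essay meets this rubric."
        )
        # The fine-tuned S-LLM predicts the trait score from essay + rationale.
        scores[trait] = scorer.predict(essay=essay, trait=trait, rationale=rationale)
    return scores
```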
Reviews: Dynamic-Depth Context Tree Weighting
The paper develops a variation on Context Tree Weighting (CTW) that keeps memory costs low by adapting the depth of each branch to the extent that it aids prediction accuracy. The new algorithm, called Utile Context Tree Weighting (UCTW), is shown empirically in some illustrative examples to use less memory than fixed-depth CTW (since it can keep some branches short) and to be more effective under a memory bound (in which it must prune a node every time it expands one). The experiments are, for the most part, well designed to answer the questions being asked. One experiment that felt less well posed was the T-Maze. The text says "We consider a maze of length 4. Thus we set K = 3." What does that "thus" mean?
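For readers outside the CTW literature, the ingredients the review takes for granted are the Krichevsky-Trofimov (KT) estimator at each node and the mixing recursion P_w = (P_e + P_w(child 0) * P_w(child 1)) / 2. A compact sketch of plain fixed-depth CTW on binary sequences; UCTW's expand/prune machinery is not shown:

```python
# Fixed-depth CTW sketch: each context node keeps KT counts and mixes its own
# estimate with the product of its children's weighted probabilities.
class Node:
    def __init__(self, depth):
        self.a = self.b = 0      # counts of 0s and 1s seen in this context
        self.pe = 1.0            # KT estimate of the bits seen here
        self.pw = 1.0            # weighted (mixed) probability
        self.children = {}       # child per preceding context bit
        self.depth = depth

    def update(self, bit, context, max_depth):
        # Sequential KT update: P(1 | a zeros, b ones) = (b + 1/2) / (a + b + 1).
        p = ((self.b if bit else self.a) + 0.5) / (self.a + self.b + 1)
        self.pe *= p
        self.a += (bit == 0)
        self.b += (bit == 1)
        if self.depth == max_depth or not context:
            self.pw = self.pe    # leaf: no mixing
        else:
            child = self.children.setdefault(context[-1], Node(self.depth + 1))
            child.update(bit, context[:-1], max_depth)
            prod = 1.0
            for ch in self.children.values():  # absent children contribute 1
                prod *= ch.pw
            self.pw = 0.5 * self.pe + 0.5 * prod

root, history = Node(0), []
for b in [1, 0, 1, 1, 0]:
    root.update(b, tuple(history[-3:]), max_depth=3)
    history.append(b)
print(root.pw)  # weighted probability of the whole sequence
```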
How Well Can You Articulate that Idea? Insights from Automated Formative Assessment
Karizaki, Mahsa Sheikhi, Gnesdilow, Dana, Puntambekar, Sadhana, Passonneau, Rebecca J.
Automated methods are becoming increasingly integrated into studies of formative feedback on students' science explanation writing. Most of this work, however, addresses students' responses to short-answer questions. We investigate automated feedback on students' science explanation essays, where students must articulate multiple ideas. Feedback is based on a rubric that identifies the main ideas students are prompted to include in explanatory essays about the physics of energy and mass, given their experiments with a simulated roller coaster. We have found that students generally improve on revised versions of their essays. Here, however, we focus on two factors that affect the accuracy of the automated feedback. First, the main ideas in the rubric differ in how much freedom they afford students in explaining them: explanation of a natural law is relatively constrained, whereas students have more freedom in how they explain complex relations they observe in their roller coasters, such as the transfer of different forms of energy. Second, by tracing the automated decision process, we can diagnose when a student's statement lacks sufficient clarity for the automated tool to associate it more strongly with one of the main ideas above all others. This in turn provides an opportunity for teachers and peers to help students reflect on how to state their ideas more clearly.
- North America > United States > Wisconsin > Dane County > Madison (0.14)
- South America > Colombia > Meta Department > Villavicencio (0.04)
- North America > United States > Pennsylvania > Centre County > State College (0.04)
- (2 more...)
- Education > Educational Setting > K-12 Education (0.95)
- Education > Assessment & Standards > Assessment Methods (0.71)
- Education > Curriculum > Subject-Specific Education (0.71)
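The decision trace described in the abstract hinges on whether one rubric idea dominates all others for a given statement. A minimal sketch of that ambiguity test, assuming an embedding model and margin chosen purely for illustration; the paper's actual tool is not specified at this level of detail:

```python
# Sketch: associate a student statement with rubric ideas, and flag it as
# unclear when no single idea wins by a sufficient margin.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def associate(statement, rubric_ideas, margin=0.05):
    # Requires at least two rubric ideas to compare.
    sims = util.cos_sim(
        model.encode([statement], convert_to_tensor=True),
        model.encode(rubric_ideas, convert_to_tensor=True),
    )[0]
    ranked = sorted(zip(rubric_ideas, sims.tolist()), key=lambda t: -t[1])
    (top_idea, top), (_, runner_up) = ranked[0], ranked[1]
    if top - runner_up < margin:
        return None, ranked  # ambiguous: statement lacks clarity for the tool
    return top_idea, ranked
```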
Interpreting Themes from Educational Stories
Zhang, Yigeng, González, Fabio A., Solorio, Thamar
Reading comprehension continues to be a crucial research focus in the NLP community. Recent advances in Machine Reading Comprehension (MRC) have mostly centered on literal comprehension, referring to the surface-level understanding of content. In this work, we focus on the next level - interpretive comprehension, with a particular emphasis on inferring the themes of a narrative text. We introduce the first dataset specifically designed for interpretive comprehension of educational narratives, providing corresponding well-edited theme texts. The dataset spans a variety of genres and cultural origins and includes human-annotated theme keywords with varying levels of granularity. We further formulate NLP tasks under different abstractions of interpretive comprehension toward the main idea of a story. After conducting extensive experiments with state-of-the-art methods, we found the task to be both challenging and significant for NLP research.
- Asia > India (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > New York (0.04)
- (17 more...)
GPT-4 Understands Discourse at Least as Well as Humans Do
Shultz, Thomas, Wise, Jamie, Nobandegani, Ardavan Salehi
We test whether a leading AI system, GPT-4, understands discourse as well as humans do, using a standardized test of discourse comprehension. Participants are presented with brief stories and then answer eight yes/no questions probing their comprehension of each story. The questions are formatted to assess the separate impacts of directness (stated vs. implied) and salience (main idea vs. details). GPT-4 performs slightly, but not statistically significantly, better than humans, given the very high level of human performance. Both GPT-4 and humans exhibit a strong ability to make inferences about information that is not explicitly stated in a story, a critical test of understanding.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > New York (0.05)
- North America > Canada > Quebec > Montreal (0.05)
- North America > United States > Arizona > Pima County > Tucson (0.04)
- Education (0.48)
- Health & Medicine > Therapeutic Area > Neurology (0.30)
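The test design above is a 2x2 crossing of directness and salience over eight yes/no items per story. A small sketch of per-condition scoring, assuming an illustrative record layout; the abstract gives the design but not a data format:

```python
# Sketch: accuracy broken down by the directness x salience conditions of a
# discourse comprehension test. The dict keys are assumed field names.
from collections import defaultdict

def accuracy_by_condition(items):
    """items: dicts with keys 'directness', 'salience', 'answer', 'response'."""
    correct, total = defaultdict(int), defaultdict(int)
    for it in items:
        cond = (it["directness"], it["salience"])
        total[cond] += 1
        correct[cond] += it["response"] == it["answer"]
    return {cond: correct[cond] / total[cond] for cond in total}

items = [
    {"directness": "stated", "salience": "main idea", "answer": True, "response": True},
    {"directness": "implied", "salience": "detail", "answer": False, "response": True},
]
print(accuracy_by_condition(items))
```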
49b8b4f95f02e055801da3b4f58e28b7-Reviews.html
The novelty itself does not feel groundbreaking because of this. The paper is also lacking in presentation: I can't see readers outside a small community following this paper without significant difficulty. I think the main ideas could be summarised nicely in one or two paragraphs, but it is currently a pain to extract them; the notation is guesswork and will frustrate a reader who is not from the field or who wants to go into details. There is no theory to support the density estimation point they make, and the covariance approximation bounds are also not very significant, since they rest on strong assumptions and do not seem very tight. Also, the connection to paper [1] needs to be pointed out clearly, with a proper discussion. On the plus side is Table 1 with the experimental results, which seem promising.