AITopics | examining

Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models

Neural Information Processing Systems, May-27-2025, 18:19:52 GMT

Recent advances in AI have been significantly driven by the capabilities of large language models (LLMs) to solve complex problems in ways that resemble human thinking. However, there is an ongoing debate about the extent to which LLMs are capable of actual reasoning. Central to this debate are two key probabilistic concepts that are essential for connecting causes to their effects: the probability of necessity (PN) and the probability of sufficiency (PS). This paper introduces a framework that is both theoretical and practical, aimed at assessing how effectively LLMs are able to replicate real-world reasoning mechanisms using these probabilistic measures. By viewing LLMs as abstract machines that process information through a natural language interface, we examine the conditions under which it is possible to compute suitable approximations of PN and PS. Our research marks an important step towards gaining a deeper understanding of when LLMs are capable of reasoning, as illustrated by a series of math examples.
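
For readers unfamiliar with the two measures named in this abstract, they are usually written counterfactually, following Pearl's standard formulation. The notation here is an assumption, since the abstract fixes none: X is a binary cause with values x and x', Y is a binary effect with values y and y', and Y_x denotes the value Y would take had X been set to x.

```latex
\begin{align}
\mathrm{PN} &= P\left(Y_{x'} = y' \mid X = x,\; Y = y\right), \\
\mathrm{PS} &= P\left(Y_{x} = y \mid X = x',\; Y = y'\right).
\end{align}
```

Read informally, PN asks how likely it is that the effect would have been absent without the cause, given that both occurred; PS asks how likely it is that the cause would have produced the effect, given that both were absent.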

language model, probability, reasoning emerge, (3 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Unequal Opportunities: Examining the Bias in Geographical Recommendations by Large Language Models

Dudy, Shiran, Tholeti, Thulasi, Ramachandranpillai, Resmi, Ali, Muhammad, Li, Toby Jia-Jun, Baeza-Yates, Ricardo

arXiv.org Artificial Intelligence, Apr-9-2025

Recent advancements in Large Language Models (LLMs) have made them a popular information-seeking tool among end users. However, the statistical training methods for LLMs have raised concerns about their representation of under-represented topics, potentially leading to biases that could influence real-world decisions and opportunities. These biases could have significant economic, social, and cultural impacts as LLMs become more prevalent, whether through direct interactions, such as when users engage with chatbots or automated assistants, or through their integration into third-party applications (as agents), where the models influence decision-making processes and functionalities behind the scenes. Our study examines the biases present in LLM recommendations of U.S. cities and towns across three domains: relocation, tourism, and starting a business. We explore two key research questions: (i) how similar LLM responses are, and (ii) how this similarity might favor areas with certain characteristics over others, introducing biases. We focus on the consistency of LLM responses and their tendency to over-represent or under-represent specific locations. Our findings point to consistent demographic biases in these recommendations, which could perpetuate a "rich-get-richer" effect that widens existing economic disparities.

artificial intelligence, geographical recommendation, large language model, (3 more...)

arXiv:2504.05325

Genre:

Research Report > New Finding (0.53)
Research Report > Experimental Study (0.53)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks

Cuadron, Alejandro, Li, Dacheng, Ma, Wenjie, Wang, Xingyao, Wang, Yichuan, Zhuang, Siyuan, Liu, Shu, Schroeder, Luis Gaspar, Xia, Tian, Mao, Huanzhi, Thumiger, Nicholas, Desai, Aditya, Stoica, Ion, Klimovic, Ana, Neubig, Graham, Gonzalez, Joseph E.

arXiv.org Artificial Intelligence, Feb-12-2025

Large Reasoning Models (LRMs) represent a breakthrough in AI problem-solving capabilities, but their effectiveness in interactive environments can be limited. This paper introduces and analyzes overthinking in LRMs, a phenomenon where models favor extended internal reasoning chains over environmental interaction. Through experiments on software engineering tasks using SWE Bench Verified, we observe three recurring patterns: Analysis Paralysis, Rogue Actions, and Premature Disengagement. We propose a framework to study these behaviors, which correlates with human expert assessments, and analyze 4018 trajectories. We observe that higher overthinking scores correlate with decreased performance, with reasoning models exhibiting stronger tendencies toward overthinking compared to non-reasoning models. Our analysis reveals that simple efforts to mitigate overthinking in agentic environments, such as selecting the solution with the lower overthinking score, can improve model performance by almost 30% while reducing computational costs by 43%. These results suggest that mitigating overthinking has strong practical implications. We suggest that overthinking tendencies could be mitigated by leveraging native function-calling capabilities and selective reinforcement learning. We also open-source our evaluation framework and dataset to facilitate research in this direction at https://github.com/AlexCuadron/Overthinking.
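
The mitigation mentioned in this abstract, selecting the solution with the lower overthinking score, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Candidate` type, its field names, and the score values are hypothetical, and the sketch assumes each candidate solution has already been given a numeric overthinking score by some external judge.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Candidate:
    """One candidate solution plus its (hypothetical) overthinking score."""
    solution: str
    overthinking_score: float  # assumption: lower means less overthinking


def pick_least_overthinking(candidates: Sequence[Candidate]) -> Candidate:
    """Return the candidate with the lowest overthinking score."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return min(candidates, key=lambda c: c.overthinking_score)


# Illustrative values only; these are not results from the paper.
runs = [
    Candidate("patch-a", 7.5),
    Candidate("patch-b", 2.0),
    Candidate("patch-c", 4.0),
]
print(pick_least_overthinking(runs).solution)  # prints "patch-b"
```

The selection itself is a single `min` over the candidates; the substantive work in such a scheme lies in how the score is produced, which the sketch leaves open.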

large language model, machine learning, natural language, (19 more...)

arXiv:2502.08235

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

From Creation to Curriculum: Examining the role of generative AI in Arts Universities

Sims, Atticus

arXiv.org Artificial Intelligence, Dec-21-2024

The age of Artificial Intelligence (AI) is marked by its transformative "generative" capabilities, distinguishing it from prior iterations. This burgeoning characteristic of AI has enabled it to produce new and original content, inherently showcasing its creative prowess. This shift challenges and requires a recalibration in the realm of arts education, urging a departure from established pedagogies centered on human-driven image creation. The paper meticulously addresses the integration of AI tools, with a spotlight on Stable Diffusion (SD), into university arts curricula. Drawing from practical insights gathered from workshops conducted in July 2023, which culminated in an exhibition of AI-driven artworks, the paper aims to provide a roadmap for seamlessly infusing these tools into academic settings. Given their recent emergence, the paper delves into a comprehensive overview of such tools, emphasizing the intricate dance between artists, developers, and researchers in the open-source AI art world. This discourse extends to the challenges and imperatives faced by educational institutions. It presents a compelling case for the swift adoption of these avant-garde tools, underscoring the paramount importance of equipping students with the competencies required to thrive in an AI-augmented artistic landscape.

large language model, machine learning, natural language, (21 more...)

arXiv:2412.16531

Country:

Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre:

Instructional Material (0.93)
Overview (0.66)
Research Report (0.64)

Industry:

Education > Curriculum > Subject-Specific Education (0.68)
Education > Educational Setting (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

The Brittleness of AI-Generated Image Watermarking Techniques: Examining Their Robustness Against Visual Paraphrasing Attacks

Barman, Niyar R, Sharma, Krish, Aziz, Ashhar, Bajpai, Shashwat, Biswas, Shwetangshu, Sharma, Vasu, Jain, Vinija, Chadha, Aman, Sheth, Amit, Das, Amitava

arXiv.org Artificial Intelligence, Aug-19-2024

The rapid advancement of text-to-image generation systems, exemplified by models like Stable Diffusion, Midjourney, Imagen, and DALL-E, has heightened concerns about their potential misuse. In response, companies like Meta and Google have intensified their efforts to implement watermarking techniques on AI-generated images to curb the circulation of potentially misleading visuals. However, in this paper, we argue that current image watermarking methods are fragile and susceptible to being circumvented through visual paraphrase attacks. The proposed visual paraphraser operates in two steps. First, it generates a caption for the given image using KOSMOS-2, one of the latest state-of-the-art image captioning systems. Second, it passes both the original image and the generated caption to an image-to-image diffusion system. During the denoising step of the diffusion pipeline, the system generates a visually similar image that is guided by the text caption. The resulting image is a visual paraphrase and is free of any watermarks. Our empirical findings demonstrate that visual paraphrase attacks can effectively remove watermarks from images. This paper provides a critical assessment, empirically revealing the vulnerability of existing watermarking techniques to visual paraphrase attacks. While we do not propose solutions to this issue, this paper serves as a call to action for the scientific community to prioritize the development of more robust watermarking techniques. Our first-of-its-kind visual paraphrase dataset and accompanying code are publicly available.

ai-generated image watermarking technique, brittleness, visual paraphrasing attack, (2 more...)

arXiv:2408.10446

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)

Examining the Effect of Implementation Factors on Deep Learning Reproducibility

Coakley, Kevin, Kirkpatrick, Christine R., Gundersen, Odd Erik

arXiv.org Artificial Intelligence, Dec-11-2023

Reproducing published deep learning papers to validate their conclusions can be difficult due to sources of irreproducibility. We investigate the impact that implementation factors have on the results and how they affect the reproducibility of deep learning studies. Three deep learning experiments were run five times each on 13 different hardware environments and four different software environments. The analysis of the 780 combined results showed that there was a greater than 6% accuracy range on the same deterministic examples, introduced by hardware or software environment variations alone. To account for these implementation factors, researchers should run their experiments multiple times in different hardware and software environments to verify that their conclusions are not affected.
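
The figure of 780 combined results follows from the design stated in this abstract: 3 experiments, 5 repetitions, 13 hardware environments, and 4 software environments. The sketch below checks that arithmetic and shows one plausible way to compute an accuracy range, taken here as maximum minus minimum; that definition and the sample accuracies are assumptions for illustration, not values or methods from the paper.

```python
from typing import Sequence


def total_runs(experiments: int, repeats: int,
               hardware_envs: int, software_envs: int) -> int:
    """Number of results in a full factorial design."""
    return experiments * repeats * hardware_envs * software_envs


def accuracy_range(accuracies: Sequence[float]) -> float:
    """Spread between the best and worst accuracy (assumed definition)."""
    if not accuracies:
        raise ValueError("need at least one accuracy value")
    return max(accuracies) - min(accuracies)


# Counts taken from the abstract: 3 * 5 * 13 * 4
print(total_runs(3, 5, 13, 4))  # prints 780

# Hypothetical accuracies for one experiment across environments.
sample = [0.91, 0.88, 0.94]
print(round(accuracy_range(sample), 2))  # prints 0.06
```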

experiment, implementation factor, software environment, (13 more...)

doi: 10.1109/eScience55777.2022.00056

arXiv:2312.06633

Country:

North America > United States > California > San Diego County > San Diego (0.05)
Europe > Norway > Central Norway > Trøndelag > Trondheim (0.05)

Genre: Research Report (1.00)

Industry: Information Technology (0.53)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Examining the Values Reflected by Children during AI Problem Formulation

Dwivedi, Utkarsh, Elsayed-ali, Salma, Bonsignore, Elizabeth, Kacorri, Hernisa

arXiv.org Artificial Intelligence, Sep-27-2023

Understanding how children design and what they value in AI interfaces that allow them to explicitly train their models, such as teachable machines, could help increase such activities' impact and guide the design of future technologies. In a co-design session using a modified storyboard, a team of 5 children (aged 7-13 years) and adult co-designers engaged in AI problem formulation activities where they imagined their own teachable machines. Our findings, leveraging an established psychological value framework (the Rokeach Value Survey), illuminate how children conceptualize and embed their values in AI systems that they themselves devise to support their everyday activities. Specifically, we find that children's proposed ideas require advanced system intelligence, e.g., emotion detection and understanding the social relationships of a user. The underlying models could be trained under multiple modalities, and any errors would be fixed by adding more data or by anticipating negative examples. Children's ideas showed they cared about family and expected machines to understand their social context before making decisions.

ai problem formulation, examining, value reflected

arXiv:2309.15839

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.60)

Examining the Limitations of Computational Rumor Detection Models Trained on Static Datasets

Mu, Yida, Song, Xingyi, Bontcheva, Kalina, Aletras, Nikolaos

arXiv.org Artificial Intelligence, Sep-20-2023

A crucial aspect of a rumor detection model is its ability to generalize, particularly its ability to detect emerging, previously unknown rumors. Past research has indicated that content-based (i.e., using solely source posts as input) rumor detection models tend to perform less effectively on unseen rumors. At the same time, the potential of context-based models remains largely untapped. The main contribution of this paper is the in-depth evaluation of the performance gap between content-based and context-based models, specifically on detecting new, unseen rumors. Our empirical findings demonstrate that context-based models are still overly dependent on the information derived from the rumors' source post and tend to overlook the significant role that contextual information can play. We also study the effect of data split strategies on classifier performance. Based on our experimental results, the paper also offers practical suggestions on how to minimize the effects of temporal concept drift in static datasets during the training of rumor detection methods.
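
To make the point about data split strategies concrete, the sketch below contrasts a random split with a chronological one. It is an illustration under stated assumptions, not the authors' protocol: the `date` and `text` fields, the 80/20 fraction, and the sample posts are hypothetical. In a chronological split every test post is later than every training post, so evaluation better reflects performance on unseen, emerging rumors.

```python
import random
from datetime import date
from typing import Dict, List, Tuple

Post = Dict[str, object]  # hypothetical record with "date" and "text" keys


def random_split(posts: List[Post], train_fraction: float = 0.8,
                 seed: int = 0) -> Tuple[List[Post], List[Post]]:
    """Shuffle, then cut: test posts may predate training posts."""
    shuffled = list(posts)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def chronological_split(posts: List[Post], train_fraction: float = 0.8
                        ) -> Tuple[List[Post], List[Post]]:
    """Sort by date, then cut: all test posts come after training posts."""
    ordered = sorted(posts, key=lambda p: p["date"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Made-up posts for illustration only.
posts = [
    {"date": date(2020, 3, 1), "text": "rumor c"},
    {"date": date(2020, 1, 1), "text": "rumor a"},
    {"date": date(2020, 5, 1), "text": "rumor e"},
    {"date": date(2020, 2, 1), "text": "rumor b"},
    {"date": date(2020, 4, 1), "text": "rumor d"},
]
train, test = chronological_split(posts)
print([p["text"] for p in test])  # prints ['rumor e']
```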

computational rumor detection model, limitation, static dataset, (1 more...)

arXiv:2309.11576

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.53)

Examining the Influence of Varied Levels of Domain Knowledge Base Inclusion in GPT-based Intelligent Tutors

Castleman, Blake, Turkcan, Mehmet Kerem

arXiv.org Artificial Intelligence, Sep-16-2023

Recent advancements in large language models (LLMs) have facilitated the development of chatbots with sophisticated conversational capabilities. However, LLMs frequently give inaccurate responses to queries, hindering applications in educational settings. In this paper, we investigate the effectiveness of integrating a knowledge base (KB) with LLM intelligent tutors to increase response reliability. To achieve this, we design a scalable KB that affords educational supervisors seamless integration of lesson curricula, which is automatically processed by the intelligent tutoring system. We then detail an evaluation in which student participants were presented with questions about the artificial intelligence curriculum to respond to. GPT-4 intelligent tutors with varying hierarchies of KB access and human domain experts then assessed these responses. Lastly, students cross-examined the intelligent tutors' responses against the domain experts' and ranked their various pedagogical abilities. Results suggest that, although these intelligent tutors still demonstrate a lower accuracy compared to domain experts, the accuracy of the intelligent tutors increases when access to a KB is granted. We also observe that the intelligent tutors with KB access exhibit better pedagogical abilities to speak like a teacher and understand students than those of domain experts, while their ability to help students still lags behind that of domain experts.

domain knowledge base inclusion, examining, gpt-based intelligent tutor

arXiv:2309.12367

Genre:

Instructional Material (0.87)
Research Report (0.69)

Industry: Education > Educational Technology > Educational Software > Computer Based Training (0.53)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Examining the Effect of Pre-training on Time Series Classification

Pu, Jiashu, Zhao, Shiwei, Cheng, Ling, Chang, Yongzhu, Wu, Runze, Lv, Tangjie, Zhang, Rongsheng

arXiv.org Artificial Intelligence, Sep-11-2023

Although the pre-training followed by fine-tuning paradigm is used extensively in many fields, there is still some controversy surrounding the impact of pre-training on the fine-tuning process. Currently, experimental findings based on text and image data lack consensus. To delve deeper into the unsupervised pre-training followed by fine-tuning paradigm, we have extended previous research to a new modality: time series. In this study, we conducted a thorough examination of 150 classification datasets derived from the Univariate Time Series (UTS) and Multivariate Time Series (MTS) benchmarks. Our analysis reveals several key conclusions. (i) Pre-training can only help improve the optimization process for models that fit the data poorly, rather than those that fit the data well. (ii) Pre-training does not exhibit the effect of regularization when given sufficient training time. (iii) Pre-training can only speed up convergence if the model has sufficient ability to fit the data. (iv) Adding more pre-training data does not improve generalization, but it can strengthen the advantage of pre-training on the original data volume, such as faster convergence. (v) While both the pre-training task and the model structure determine the effectiveness of the paradigm on a given dataset, the model structure plays a more significant role.

dataset, model structure, pre-training task, (12 more...)

arXiv:2309.05256

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Singapore (0.04)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
