aptitude


On Explaining Proxy Discrimination and Unfairness in Individual Decisions Made by AI Systems

Sonna, Belona, Grastien, Alban

arXiv.org Artificial Intelligence

Artificial intelligence (AI) systems in high-stakes domains raise concerns about proxy discrimination, unfairness, and explainability. Existing audits often fail to reveal why unfairness arises, particularly when rooted in structural bias. We propose a novel framework using formal abductive explanations to explain proxy discrimination in individual AI decisions. Leveraging background knowledge, our method identifies which features act as unjustified proxies for protected attributes, revealing hidden structural biases. Central to our approach is the concept of aptitude, a task-relevant property independent of group membership, with a mapping function aligning individuals of equivalent aptitude across groups to assess fairness substantively. As a proof of concept, we showcase the framework with examples taken from the German credit dataset, demonstrating its applicability in real-world cases.
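The core idea of an abductive explanation — a minimal set of feature values that, on their own, entail the model's decision — can be illustrated with a toy brute-force search. This is only a sketch under assumed names (`abductive_explanation`, small finite feature domains); the paper's framework additionally incorporates background knowledge and formal logical encodings rather than enumeration.

```python
from itertools import combinations, product

def abductive_explanation(model, instance, domains):
    """Return a smallest set of features that, fixed to the instance's
    values, forces the model's decision no matter how the remaining
    features vary over their domains (brute-force, toy scale only)."""
    target = model(instance)
    features = list(instance)
    for size in range(len(features) + 1):
        for subset in combinations(features, size):
            free = [f for f in features if f not in subset]
            # Check every completion of the free features.
            if all(model({**{f: instance[f] for f in subset},
                          **dict(zip(free, vals))}) == target
                   for vals in product(*(domains[f] for f in free))):
                return set(subset)
```

For a hypothetical credit rule such as `income >= 2 or guarantor == 1`, an applicant with high income gets `{"income"}` as the explanation: the guarantor feature is irrelevant to this particular decision.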


ERGO: Entropy-guided Resetting for Generation Optimization in Multi-turn Language Models

Khalid, Haziq Mohammad, Jeyaganthan, Athikash, Do, Timothy, Fu, Yicheng, O'Brien, Sean, Sharma, Vasu, Zhu, Kevin

arXiv.org Artificial Intelligence

Large Language Models (LLMs) suffer significant performance degradation in multi-turn conversations when information is presented incrementally. Given that multi-turn conversations characterize everyday interactions with LLMs, this degradation poses a severe challenge to real-world usability. We hypothesize that abrupt increases in model uncertainty signal misalignment in multi-turn LLM interactions, and we exploit this insight to dynamically realign conversational context. We introduce ERGO (Entropy-guided Resetting for Generation Optimization), which continuously quantifies internal uncertainty via Shannon entropy over next-token distributions and triggers adaptive prompt consolidation when a sharp spike in entropy is detected. By treating uncertainty as a first-class signal rather than a nuisance to eliminate, ERGO embraces variability in language and modeling, representing and responding to uncertainty. In multi-turn tasks with incrementally revealed instructions, ERGO yields a 56.6% average performance gain over standard baselines, increases aptitude (peak performance capability) by 24.7%, and decreases unreliability (variability in performance) by 35.3%, demonstrating that uncertainty-aware interventions can improve both accuracy and reliability in conversational AI.
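The two ingredients named in the abstract — Shannon entropy over a next-token distribution and spike detection against recent history — can be sketched in a few lines. The function names and the spike rule (current entropy exceeding the running mean by a fixed margin) are illustrative assumptions, not the paper's exact trigger:

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in bits) of a next-token probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def entropy_spike(history, current, threshold=1.5):
    """Flag a spike when the current entropy exceeds the running mean
    of past per-step entropies by more than `threshold` bits."""
    if not history:
        return False
    mean = sum(history) / len(history)
    return current - mean > threshold
```

A uniform distribution over two tokens has entropy 1 bit; a near-deterministic one approaches 0 bits, so a sudden jump in this quantity marks the point where the model's next-token predictions become diffuse.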


LLMs Get Lost In Multi-Turn Conversation

Laban, Philippe, Hayashi, Hiroaki, Zhou, Yingbo, Neville, Jennifer

arXiv.org Artificial Intelligence

Large Language Models (LLMs) are conversational interfaces. As such, LLMs have the potential to assist their users not only when they can fully specify the task at hand, but also to help them define, explore, and refine what they need through multi-turn conversational exchange. Although analysis of LLM conversation logs has confirmed that underspecification occurs frequently in user instructions, LLM evaluation has predominantly focused on the single-turn, fully-specified instruction setting. In this work, we perform large-scale simulation experiments to compare LLM performance in single- and multi-turn settings. Our experiments confirm that all the top open- and closed-weight LLMs we test exhibit significantly lower performance in multi-turn conversations than single-turn, with an average drop of 39% across six generation tasks. Analysis of 200,000+ simulated conversations decomposes the performance degradation into two components: a minor loss in aptitude and a significant increase in unreliability. We find that LLMs often make assumptions in early turns and prematurely attempt to generate final solutions, on which they overly rely. In simpler terms, we discover that *when LLMs take a wrong turn in a conversation, they get lost and do not recover*.


Towards Responsible AI in Education: Hybrid Recommendation System for K-12 Students Case Study

Drushchak, Nazarii, Tyshchenko, Vladyslava, Polyakovska, Nataliya

arXiv.org Artificial Intelligence

The growth of Educational Technology (EdTech) has enabled highly personalized learning experiences through Artificial Intelligence (AI)-based recommendation systems tailored to each student's needs. However, these systems can unintentionally introduce biases, potentially limiting fair access to learning resources. This study presents a recommendation system for K-12 students, combining graph-based modeling and matrix factorization to provide personalized suggestions for extracurricular activities, learning resources, and volunteering opportunities. To address fairness concerns, the system includes a framework to detect and reduce biases by analyzing feedback across protected student groups. This work highlights the need for continuous monitoring in educational recommendation systems to support equitable, transparent, and effective learning opportunities for all students.

INTRODUCTION The rapid advancement of Educational Technology (EdTech) has significantly reshaped traditional learning environments, enabling the delivery of personalized educational experiences tailored to individual students' needs. According to the U.S. Department of Education Office of Educational Technology, leveraging AI-based modern educational technologies has been pivotal in providing personalized pathways for learning, supporting adaptive and individualized instruction, and enhancing student engagement through innovative digital solutions. This trend toward personalization in education underscores the importance of leveraging advanced recommendation systems to support student exploration and growth.
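The matrix factorization component mentioned in the abstract can be sketched with plain stochastic gradient descent: each student and each activity gets a small latent vector, and the dot product predicts the feedback score. This is a generic MF sketch under assumed names and hyperparameters, not the paper's specific hybrid model:

```python
import random

def matrix_factorization(ratings, n_users, n_items, k=2,
                         lr=0.05, reg=0.02, epochs=500):
    """Factor a sparse list of (user, item, rating) triples into
    user and item latent vectors via SGD with L2 regularization."""
    random.seed(0)
    U = [[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = sum(U[u][f] * V[i][f] for f in range(k))
            err = r - pred
            for f in range(k):
                # Gradient step on both factors, shrunk toward zero by reg.
                U[u][f] += lr * (err * V[i][f] - reg * U[u][f])
                V[i][f] += lr * (err * U[u][f] - reg * V[i][f])
    return U, V
```

Unobserved student-activity pairs can then be scored with the same dot product, which is what makes the factorization usable as a recommender.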


Nexus: A Brief History of Information Networks from the Stone Age to AI by Yuval Noah Harari review – rage against the machine

The Guardian

What jumps to mind when you think about the impending AI apocalypse? If you're partial to sci-fi movie cliches, you may envisage killer robots (with or without thick Austrian accents) rising up to terminate their hubristic creators. Or perhaps, a la The Matrix, you'll go for scary machines sucking energy out of our bodies as they distract us with a simulated reality. For Yuval Noah Harari, who has spent a lot of time worrying about AI over the past decade, the threat is less fantastical and more insidious. "In order to manipulate humans, there is no need to physically hook brains to computers," he writes in his engrossing new book Nexus.


The limitations of scaling up AI language models

#artificialintelligence

But the dominant approach to developing these models involves leveraging massive computational resources, which has consequences. Beyond the fact that training and deploying large language models can incur high technical costs, the requirements put the models beyond the reach of many organizations and institutions. Scaling also doesn't resolve the major problem of model bias and toxicity, which often creeps in from the data used to train the models. In a panel during the Conference on Neural Information Processing Systems (NeurIPS) 2021, experts from the field discussed how the research community should adapt as progress in language models continues to be driven by scaled-up algorithms. The panelists explored how to ensure that smaller institutions can meaningfully research and audit large-scale systems, as well as ways that they can help to ensure that the systems behave as intended.


No-code Platforms Set To Accelerate Data, AI Adoption

#artificialintelligence

Talk of low-code, no-code platforms has been making the rounds of late, with Goldman Sachs injecting USD90 million into low-code software maker WSO2, while data automation platform Cascade Labs last week raised USD5.3 million. And as observed in a Washington Post report this month, the rapid rise of low-code has allowed non-computer scientists to create digital applications that were previously the domain of computer science graduates, while simultaneously opening the door to deliver fast and meaningful impact to organizations. But what exactly is low-code, and what implications does it have for building a vibrant data culture or developing data-centric and AI applications? At its heart, low-code is essentially a development environment for creating application software by leveraging scripting and a graphical user interface (GUI). The ability to visually configure applications significantly speeds development over traditional programming languages such as C or Python.


How to Not Lose Your Job to AI

#artificialintelligence

The question is now: are we becoming irrelevant? At first glance, that's what the future might look like when witnessing OpenAI's new platform, Cortex. Cortex essentially allows you to ask it, in "human speak," to code things for you. Take this demo clip for instance. In the full video, you can see Cortex put together a simple website from a couple of sentences.


Best practices to build data literacy into your Gen Z workforce - Data Dreamer

#artificialintelligence

This is a guest post by Kirk Borne, Ph.D., Chief Science Officer at DataPrime.ai. Kirk is also a consultant, astrophysicist, data scientist, blogger, data literacy advocate, and renowned speaker, and is one of the most recognized names in the industry. A survey of 1,100 data practitioners and business leaders reported that 84% of organizations consider data literacy to be a core business skill, agreeing with the statement that the inability of the workforce to use and analyze data effectively can hamper their business success. In addition, 36% said data literacy is crucial to future-proofing their business. Another survey found that 75% of employees are not comfortable using data.


Decoding machine learning benchmarks

Cardoso, Lucas F. F., Santos, Vitor C. A., Francês, Regiane S. K., Prudêncio, Ricardo B. C., Alves, Ronnie C. O.

arXiv.org Machine Learning

Despite the availability of benchmark machine learning (ML) repositories (e.g., UCI, OpenML), there is no standard evaluation strategy yet capable of pointing out which is the best set of datasets to serve as a gold standard to test different ML algorithms. In recent studies, Item Response Theory (IRT) has emerged as a new approach to elucidate what a good ML benchmark should be. This work applied IRT to explore the well-known OpenML-CC18 benchmark to identify how suitable it is for the evaluation of classifiers. Several classifiers, ranging from classical to ensemble ones, were evaluated using IRT models, which could simultaneously estimate dataset difficulty and classifiers' ability. The Glicko-2 rating system was applied on top of IRT to summarize the innate ability and aptitude of classifiers. It was observed that not all datasets from OpenML-CC18 are really useful to evaluate classifiers. Most datasets evaluated in this work (84%) contain easy instances in general (e.g., around 10% of difficult instances only). Also, 80% of the instances in half of this benchmark are very discriminating ones, which can be of great use for pairwise algorithm comparison, but not useful to push classifiers' abilities. This paper presents this new evaluation methodology based on IRT as well as the tool decodIRT, developed to guide IRT estimation over ML benchmarks.
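The IRT model underlying this kind of analysis is compact: in the two-parameter logistic (2PL) form, the probability that a respondent (here, a classifier) with ability theta succeeds on an item (a dataset instance) depends on the item's difficulty b and discrimination a. This is the standard 2PL formula, sketched here for illustration; the paper may use other IRT variants as well:

```python
import math

def irt_2pl(theta, a, b):
    """2PL Item Response Theory: probability that a respondent with
    ability `theta` answers correctly an item with discrimination `a`
    and difficulty `b`."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))
```

When ability equals difficulty the success probability is exactly 0.5, and a larger discrimination `a` steepens the curve, which is why highly discriminating instances are useful for pairwise comparison of classifiers but say little about pushing peak ability.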