AITopics

2510.23081

Country:

Asia (0.93)
Europe > Austria (0.28)
North America > Mexico (0.28)

Genre:

Overview (0.68)
Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Artificial Intelligence Nov-5-2025

Reflections from Research Roundtables at the Conference on Health, Inference, and Learning (CHIL) 2025

Alsentzer, Emily, Charpignon, Marie-Laure, Chen, Bill, D'Souza, Niharika, Fries, Jason, Jiang, Yixing, Kashyap, Aparajita, Kim, Chanwoo, Lee, Simon, Mandyam, Aishwarya, Mbilinyi, Ashery, Mehandru, Nikita, Nagesh, Nitish, Nuwagira, Brighton, Pierson, Emma, Pillai, Arvind, Sano, Akane, Syeda-Mahmood, Tanveer, Yadav, Shashank, Adhanom, Elias, Afza, Muhammad Umar, Archer, Amelia, Bedi, Suhana, Bikia, Vasiliki, Chang, Trenton, Chen, George H., Chen, Winston, Chiang, Erica, Choi, Edward, Ciora, Octavia, Dozie-Nnamah, Paz, Elsharief, Shaza, Engelhard, Matthew, Eshragh, Ali, Feng, Jean, Fessel, Josh, Fleming, Scott, Fong, Kei Sen, Frost, Thomas, Gadgil, Soham, Gichoya, Judy, Hershkovich, Leeor, Im, Sujeong, Jain, Bhavya, Jeanselme, Vincent, Jia, Furong, Jin, Qixuan, Jin, Yuxuan, Kapash, Daniel, Kapoor, Geetika, Kiafar, Behdokht, Kleiner, Matthias, Kraft, Stefan, Kumar, Annika, Kyung, Daeun, Liang, Zhongyuan, Lin, Joanna, Liu, Qianchu, Liu, Chang, Luan, Hongzhou, Lunt, Chris, López, Leopoldo Julían Lechuga, McDermott, Matthew B. A., Noroozizadeh, Shahriar, O'Brien, Connor, Oh, YongKyung, Ota, Mixail, Pfohl, Stephen, Pi, Meagan, Pias, Tanmoy Sarkar, Rocheteau, Emma, Sethi, Avishaan, Shirakawa, Toru, Silver, Anita, Simha, Neha, Stankeviciute, Kamile, Sunog, Max, Szolovits, Peter, Tang, Shengpu, Tang, Jialu, Tierney, Aaron, Valdovinos, John, Wallace, Byron, Wang, Will Ke, Washington, Peter, Weiss, Jeremy, Wolfe, Daniel, Wong, Emily, Yun, Hye Sun, Zhang, Xiaoman, Zhang, Xiao Yu Cindy, Jeong, Hayoung, Thakoor, Kaveri A.

The 6th annual Conference on Health, Inference, and Learning (CHIL 2025), hosted by the Association for Health Learning and Inference (AHLI), was held in person on June 25-27, 2025, at the University of California, Berkeley, in Berkeley, California, USA. As part of this year's program, we hosted Research Roundtables to catalyze collaborative, small-group dialogue around critical, timely topics at the intersection of machine learning and healthcare. Each roundtable was moderated by a team of senior and junior chairs who fostered open exchange, intellectual curiosity, and inclusive engagement. The sessions emphasized rigorous discussion of key challenges, creative exploration of emerging opportunities, and collective ideation toward actionable directions in the field. Overall, the Research Roundtables brought together a diverse mix of participants, including academic researchers, clinicians, industry professionals, and policy experts. In total, eight roundtables were held across two 30-minute sessions, with a brief transition break to allow participants to join multiple discussions.

data mining, large language model, machine learning, (18 more...)

2510.15217

Country: North America > United States > California > Alameda County > Berkeley (0.54)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.93)
Overview (0.92)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
(7 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Reasoning Beyond Language: A Comprehensive Survey on Latent Chain-of-Thought Reasoning

Chen, Xinghao, Zhao, Anhao, Xia, Heming, Lu, Xuan, Wang, Hanlin, Chen, Yanjun, Zhang, Wei, Wang, Jian, Li, Wenjie, Shen, Xiaoyu

Large Language Models (LLMs) have shown impressive performance on complex tasks through Chain-of-Thought (CoT) reasoning. However, conventional CoT relies on explicitly verbalized intermediate steps, which constrains its broader applicability, particularly in abstract reasoning tasks beyond language. To address this, there has been growing research interest in latent CoT reasoning, where the reasoning process is embedded within latent spaces. By decoupling reasoning from explicit language generation, latent CoT offers the promise of richer cognitive representations and facilitates more flexible, faster inference. This paper aims to present a comprehensive overview of this emerging paradigm and establish a systematic taxonomy. We analyze recent advances in methods, categorizing them from token-wise horizontal approaches to layer-wise vertical strategies. We then provide in-depth discussions of these methods, highlighting their design principles, applications, and remaining challenges. We hope that our survey provides a structured foundation for advancing this promising direction in LLM reasoning. The relevant papers will be regularly updated at https://github.com/EIT-NLP/Awesome-Latent-CoT.

artificial intelligence, large language model, natural language, (13 more...)

2505.16782

Country:

Europe > Austria (0.28)
Asia > China (0.28)
North America > United States > Minnesota (0.28)

Genre:

Research Report (1.00)
Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.90)

Durrani, Nadir, Mousi, Basel, Dalvi, Fahim

Editing Across Languages: A Survey of Multilingual Knowledge Editing

While Knowledge Editing has been extensively studied in monolingual settings, it remains underexplored in multilingual contexts. This survey systematizes recent research on Multilingual Knowledge Editing (MKE), a growing subdomain of model editing focused on ensuring factual edits generalize reliably across languages. We present a comprehensive taxonomy of MKE methods, covering parameter-based, memory-based, fine-tuning, and hypernetwork approaches. We survey available benchmarks, summarize key findings on method effectiveness and transfer patterns, identify challenges in cross-lingual propagation, and highlight open problems related to language anisotropy, evaluation coverage, and edit scalability. Our analysis consolidates a rapidly evolving area and lays the groundwork for future progress in editable, language-aware LLMs.

large language model, machine learning, natural language, (16 more...)

2505.14393

Country:

Europe (1.00)
North America (0.68)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)

Genre:

Overview (1.00)
Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

MARFT: Multi-Agent Reinforcement Fine-Tuning

Liao, Junwei, Wen, Muning, Wang, Jun, Zhang, Weinan

LLM-based Multi-Agent Systems (LaMAS) have demonstrated remarkable capabilities in addressing complex, agentic tasks, from generating high-quality presentation slides to conducting sophisticated scientific research. Meanwhile, reinforcement learning (RL) has been widely recognized for its effectiveness in enhancing agent intelligence, but limited research has investigated the fine-tuning of LaMAS using foundational RL techniques. Moreover, the direct application of multi-agent reinforcement learning (MARL) methods to LaMAS introduces significant challenges, stemming from the unique characteristics and mechanisms inherent to LaMAS. To address these challenges, this article presents a comprehensive study of LLM-based MARL and proposes a novel paradigm termed Multi-Agent Reinforcement Fine-Tuning (MARFT). We introduce a new Markov Game (MG) called Flex-MG, which aligns with LaMAS optimization in real-world applications, and a universal algorithmic framework tailored specifically for LaMAS, outlining the conceptual foundations, key distinctions, and practical implementation strategies. We review the evolution from RL to reinforcement fine-tuning (RFT), setting the stage for a parallel analysis in the multi-agent domain. In the context of LaMAS, we elucidate critical differences between MARL and MARFT. These differences motivate a transition toward a LaMAS-oriented formulation of RFT. Central to this work is a robust and scalable MARFT framework. We detail the core algorithm and provide a complete, open-source implementation to facilitate adoption and further research. The latter sections of the paper explore real-world application perspectives and open challenges in MARFT. By bridging theoretical underpinnings with practical methodologies, this work serves as a roadmap for researchers seeking to advance MARFT toward resilient and adaptive solutions in agentic systems. Our implementation of the proposed framework is publicly available at: https://github.com/jwliao-ai/MARFT.

artificial intelligence, deep learning, machine learning, (15 more...)

2504.16129

Country:

Asia (0.92)
North America > United States > Massachusetts (0.27)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Leisure & Entertainment (0.67)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Trustworthy AI Must Account for Interactions

Cresswell, Jesse C.

Trustworthy AI encompasses many aspirational aspects for aligning AI systems with human values, including fairness, privacy, robustness, explainability, and uncertainty quantification. Ultimately, the goal of Trustworthy AI research is to achieve all aspects simultaneously. However, efforts to enhance one aspect often introduce unintended trade-offs that negatively impact others. In this position paper, we review notable approaches to these five aspects and systematically consider every pair, detailing the negative interactions that can arise. For example, applying differential privacy to model training can amplify biases, undermining fairness. Drawing on these findings, we take the position that current research practices of improving one or two aspects in isolation are insufficient. Instead, research on Trustworthy AI must account for interactions between aspects and adopt a holistic view across all relevant axes at once. To illustrate our perspective, we provide guidance on how practitioners can work towards integrated trust, examples of how interactions affect the financial industry, and alternative views.

artificial intelligence, machine learning, prediction, (15 more...)

2504.0717

Genre: Overview (0.87)

Industry:

Information Technology > Security & Privacy (1.00)
Banking & Finance (1.00)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

PolyG: Adaptive Graph Traversal for Diverse GraphRAG Questions

Liu, Renjie, Jiang, Haitian, Yan, Xiao, Tang, Bo, Li, Jinyang

GraphRAG enhances large language models (LLMs) to generate quality answers for user questions by retrieving related facts from external knowledge graphs. However, current GraphRAG methods are primarily evaluated on and overly tailored for knowledge graph question answering (KGQA) benchmarks, which are biased towards a few specific question patterns and do not reflect the diversity of real-world questions. To better evaluate GraphRAG methods, we propose a complete four-class taxonomy to categorize the basic patterns of knowledge graph questions and use it to create PolyBench, a new GraphRAG benchmark encompassing a comprehensive set of graph questions. With the new benchmark, we find that existing GraphRAG methods fall short in effectiveness (i.e., quality of the generated answers) and/or efficiency (i.e., response time or token usage) because they adopt either a fixed graph traversal strategy or free-form exploration by LLMs for fact retrieval. However, different question patterns require distinct graph traversal strategies and context formation. To facilitate better retrieval, we propose PolyG, an adaptive GraphRAG approach that decomposes and categorizes questions according to our proposed question taxonomy. Built on top of a unified interface and execution engine, PolyG dynamically prompts an LLM to generate a graph database query to retrieve the context for each decomposed basic question. Compared with SOTA GraphRAG methods, PolyG achieves a higher win rate in generation quality and has a low response latency and token cost. Our code and benchmark are open-source at https://github.com/Liu-rj/PolyG.

large language model, machine learning, question answering, (21 more...)

2504.02112

Country:

North America > United States (1.00)
Europe (0.93)

Genre:

Research Report (0.82)
Workflow (0.68)
Overview (0.67)

Industry: Information Technology (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Jamet, Alexandre Valentin, Vavouliotis, Georgios, Jiménez, Daniel A., Alvarez, Lluc, Casas, Marc

A Two Level Neural Approach Combining Off-Chip Prediction with Adaptive Prefetch Filtering

To alleviate the performance and energy overheads of contemporary applications with large data footprints, we propose the Two Level Perceptron (TLP) predictor, a neural mechanism that effectively combines predicting whether an access will be off-chip with adaptive prefetch filtering at the first-level data cache (L1D). TLP is composed of two connected microarchitectural perceptron predictors, named First Level Predictor (FLP) and Second Level Predictor (SLP). FLP performs accurate off-chip prediction by using several program features based on virtual addresses and a novel selective delay component. The novelty of SLP lies in leveraging off-chip prediction to drive L1D prefetch filtering by using physical addresses and the FLP prediction as features. TLP constitutes the first hardware proposal targeting both off-chip prediction and prefetch filtering using a multi-level perceptron hardware approach. TLP only requires 7KB of storage. To demonstrate the benefits of TLP, we compare its performance with state-of-the-art approaches using off-chip prediction and prefetch filtering on a wide range of single-core and multi-core workloads. Our experiments show that TLP reduces the average DRAM transactions by 30.7% and 17.7%, as compared to a baseline using state-of-the-art cache prefetchers but no off-chip prediction mechanism, across the single-core and multi-core workloads, respectively, while recent work significantly increases DRAM transactions. As a result, TLP achieves geometric mean performance speedups of 6.2% and 11.8% across single-core and multi-core workloads, respectively. In addition, our evaluation demonstrates that TLP is effective independently of the L1D prefetching logic.
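
The two-level arrangement described in this abstract can be illustrated with a rough software sketch. Everything below is an assumption made for illustration and not the authors' design: the class name `HashedPerceptron`, the table size, the training threshold, the weight range, and the specific feature hashes are all hypothetical, and the selective delay component mentioned in the abstract is not modeled. The only parts taken from the abstract are that the first predictor uses virtual-address-based features and that the second uses physical addresses plus the first predictor's output.

```python
class HashedPerceptron:
    """Table-based perceptron: one small table of saturating weights per feature."""

    def __init__(self, num_features, table_size=256, threshold=8, w_max=15):
        # All sizes here are illustrative assumptions, not values from the paper.
        self.tables = [[0] * table_size for _ in range(num_features)]
        self.table_size = table_size
        self.threshold = threshold
        self.w_max = w_max

    def _indices(self, features):
        # Map each integer feature to a slot in its own weight table.
        return [f % self.table_size for f in features]

    def score(self, features):
        # Sum the selected weight from every table.
        return sum(t[i] for t, i in zip(self.tables, self._indices(features)))

    def predict(self, features):
        # A non-negative sum is read as a positive prediction.
        return self.score(features) >= 0

    def train(self, features, outcome):
        # Update on a misprediction or when the sum is below the confidence threshold.
        s = self.score(features)
        mispredicted = (s >= 0) != outcome
        if mispredicted or abs(s) < self.threshold:
            step = 1 if outcome else -1
            for t, i in zip(self.tables, self._indices(features)):
                t[i] = max(-self.w_max, min(self.w_max, t[i] + step))


def flp_features(pc, vaddr):
    # Hypothetical first-level features derived from the PC and virtual address.
    return [pc, vaddr >> 6, (vaddr >> 12) ^ pc]


def slp_features(paddr, flp_says_off_chip):
    # Hypothetical second-level features: physical address bits plus the FLP output.
    return [paddr >> 6, paddr >> 12, int(flp_says_off_chip)]


flp = HashedPerceptron(num_features=3)  # predicts "this access goes off-chip"
slp = HashedPerceptron(num_features=3)  # predicts "this prefetch should be issued"


def should_issue_prefetch(pc, vaddr, paddr):
    # The first-level prediction is fed to the second level as one of its features.
    off_chip = flp.predict(flp_features(pc, vaddr))
    return slp.predict(slp_features(paddr, off_chip))
```

In this sketch each predictor would be trained separately once the true outcome is known, for example `flp.train(flp_features(pc, vaddr), went_off_chip)`, where `went_off_chip` is a hypothetical ground-truth flag supplied by a simulator.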

artificial intelligence, machine learning, prediction, (15 more...)

doi: 10.1109/HPCA57654.2024.00046

2403.15181

Country: North America > United States (0.46)

Genre:

Research Report > Promising Solution (0.48)
Overview > Innovation (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.77)

Wulff, Dirk U., Mata, Rui

Advancing Cognitive Science with LLMs

Cognitive science faces ongoing challenges in knowledge synthesis and conceptual clarity, in part due to its multifaceted and interdisciplinary nature. Recent advances in artificial intelligence, particularly the development of large language models (LLMs), offer tools that may help to address these issues. This review examines how LLMs can support areas where the field has historically struggled, including establishing cross-disciplinary connections, formalizing theories, developing clear measurement taxonomies, achieving generalizability through integrated modeling frameworks, and capturing contextual and individual variation. We outline the current capabilities and limitations of LLMs in these domains, including potential pitfalls. Taken together, we conclude that LLMs can serve as tools for a more integrative and cumulative cognitive science when used judiciously to complement, rather than replace, human expertise.

artificial intelligence, large language model, natural language, (17 more...)

2511.00206

Country:

North America > United States (1.00)
Europe (1.00)

Genre:

Overview (0.88)
Research Report > New Finding (0.67)
Instructional Material > Course Syllabus & Notes (0.46)
Research Report > Experimental Study (0.46)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Cognitive Architectures (1.00)

Sabouri, Milad, Mansoury, Masoud, Lin, Kun, Mobasher, Bamshad

Effectiveness of LLMs in Temporal User Profiling for Recommendation

Effectively modeling the dynamic nature of user preferences is crucial for enhancing recommendation accuracy and fostering transparency in recommender systems. Traditional user profiling often overlooks the distinction between transitory short-term interests and stable long-term preferences. This paper examines the capability of leveraging Large Language Models (LLMs) to capture these temporal dynamics, generating richer user representations through distinct short-term and long-term textual summaries of interaction histories. Our observations suggest that while LLMs tend to improve recommendation quality in domains with more active user engagement, their benefits appear less pronounced in sparser environments. This disparity likely stems from the varying distinguishability of short-term and long-term preferences across domains; the approach shows greater utility where these temporal interests are more clearly separable (e.g., Movies&TV) compared to domains with more stable user profiles (e.g., Video Games). This highlights a critical trade-off between enhanced performance and computational costs, suggesting context-dependent LLM application. Beyond predictive capability, this LLM-driven approach offers intrinsic potential for interpretability through its natural language profiles and attention weights. This work contributes insights into the practical capability and inherent interpretability of LLM-driven temporal user profiling, outlining new research directions for developing adaptive and transparent recommender systems.

large language model, machine learning, natural language, (16 more...)

2511.00176

Country: North America > United States > Minnesota (0.28)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Leisure & Entertainment (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)