AITopics

2501.04437

Country:

North America > United States (0.45)
Asia > Middle East (0.28)

Genre:

Research Report > Promising Solution (1.00)
Overview (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceJan-8-2025

Understanding Before Reasoning: Enhancing Chain-of-Thought with Iterative Summarization Pre-Prompting

Zhu, Dong-Hai, Xiong, Yu-Jie, Zhang, Jia-Chen, Xie, Xi-Jiong, Xia, Chun-Ming

Chain-of-Thought (CoT) Prompting is a dominant paradigm in Large Language Models (LLMs) to enhance complex reasoning. It guides LLMs to present multi-step reasoning, rather than generating the final answer directly. However, CoT encounters difficulties when key information required for reasoning is implicit or missing. This occurs because CoT emphasizes the sequence of reasoning steps while overlooking the early extraction of essential information. We propose a pre-prompting method called Iterative Summarization Pre-Prompting (ISP^2) to refine LLM reasoning when key information is not explicitly provided. First, entities and their corresponding descriptions are extracted to form potential key information pairs. Next, we use a reliability rating to assess these pairs, then merge the two lowest-ranked pairs into a new entity description. This process is repeated until a unique key information pair is obtained. Finally, that pair, along with the original question, is fed into LLMs to produce the answer. Extensive experiments demonstrate a 7.1% improvement compared to existing methods. Unlike traditional prompting, ISP^2 adopts an inductive approach with pre-prompting, offering flexible integration into diverse reasoning frameworks. The code is available at https://github.com/zdhgreat/ISP-2.

isp 2, large language model, machine learning, (19 more...)

2501.04341

Country:

Asia > China (0.29)
Europe (0.28)

Genre:

Overview (0.68)
Research Report (0.64)
Instructional Material (0.46)

Industry:

Education (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

arXiv.org Artificial IntelligenceJan-8-2025

LLM4SR: A Survey on Large Language Models for Scientific Research

Luo, Ziming, Yang, Zonglin, Xu, Zexin, Yang, Wei, Du, Xinya

In recent years, the rapid advancement of Large Language Models (LLMs) has transformed the landscape of scientific research, offering unprecedented support across various stages of the research cycle. This paper presents the first systematic survey dedicated to exploring how LLMs are revolutionizing the scientific research process. We analyze the unique roles LLMs play across four critical stages of research: hypothesis discovery, experiment planning and implementation, scientific writing, and peer reviewing. Our review comprehensively showcases the task-specific methodologies and evaluation benchmarks. By identifying current challenges and proposing future research directions, this survey not only highlights the transformative potential of LLMs, but also aims to inspire and guide researchers and practitioners in leveraging LLMs to advance scientific inquiry. Resources are available at the following repository: https://github.com/du-nlp-lab/LLM4SR

large language model, machine learning, natural language, (20 more...)

2501.04306

Country:

Asia > Middle East (0.92)
North America > United States > Texas (0.14)
Europe > Austria > Vienna (0.14)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.88)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Bahar, Atmane Ayoub Mansour, Ferrahi, Kamel Soaid, Messai, Mohamed-Lamine, Seba, Hamida, Amrouche, Karima

CONTINUUM: Detecting APT Attacks through Spatial-Temporal Graph Neural Networks

Advanced Persistent Threats (APTs) represent a significant challenge in cybersecurity due to their sophisticated and stealthy nature. Traditional Intrusion Detection Systems (IDS) often fall short in detecting these multi-stage attacks. Recently, Graph Neural Networks (GNNs) have been employed to enhance IDS capabilities by analyzing the complex relationships within networked data. However, existing GNN-based solutions are hampered by high false positive rates and substantial resource consumption. In this paper, we present a novel IDS designed to detect APTs using a Spatio-Temporal Graph Neural Network Autoencoder. Our approach leverages spatial information to understand the interactions between entities within a graph and temporal information to capture the evolution of the graph over time. This dual perspective is crucial for identifying the sequential stages of APTs. Furthermore, to address privacy and scalability concerns, we deploy our architecture in a federated learning environment. This setup ensures that local data remains on-premise while encrypted model-weights are shared and aggregated using homomorphic encryption, maintaining data privacy and security. Our evaluation shows that this system effectively detects APTs with lower false positive rates and optimized resource usage compared to existing methods, highlighting the potential of spatio-temporal analysis and federated learning in enhancing cybersecurity defenses.

artificial intelligence, detection, machine learning, (16 more...)

2501.02981

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > Switzerland > Bern > Bern (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(4 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.45)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Sustainable and Intelligent Public Facility Failure Management System Based on Large Language Models

Bi, Siguo, Zhang, Jilong, Ni, Wei

This paper presents a new Large Language Model (LLM)-based Smart Device Management framework, a pioneering approach designed to address the intricate challenges of managing intelligent devices within public facilities, with a particular emphasis on applications to libraries. Our framework leverages state-of-the-art LLMs to analyze and predict device failures, thereby enhancing operational efficiency and reliability. Through prototype validation in real-world library settings, we demonstrate the framework's practical applicability and its capacity to significantly reduce budgetary constraints on public facilities. The advanced and innovative nature of our model is evident from its successful implementation in prototype testing. We plan to extend the framework's scope to include a wider array of public facilities and to integrate it with cutting-edge cybersecurity technologies, such as Internet of Things (IoT) security and machine learning algorithms for threat detection and response. This will result in a comprehensive and proactive maintenance system that not only bolsters the security of intelligent devices but also utilizes machine learning for automated analysis and real-time threat mitigation. By incorporating these advanced cybersecurity elements, our framework will be well-positioned to tackle the dynamic challenges of modern public infrastructure, ensuring robust protection against potential threats and enabling facilities to anticipate and prevent failures, leading to substantial cost savings and enhanced service quality.

large language model, machine learning, smart device, (21 more...)

2501.06231

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > United States > District of Columbia > Washington (0.04)
Europe > Switzerland (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre:

Research Report > Promising Solution (1.00)
Overview (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Detecting Neurocognitive Disorders through Analyses of Topic Evolution and Cross-modal Consistency in Visual-Stimulated Narratives

Li, Jinchao, Wang, Yuejiao, Li, Junan, Kang, Jiawen, Zheng, Bo, Wong, Simon, Mak, Brian, Fung, Helene, Woo, Jean, Mak, Man-Wai, Kwok, Timothy, Mok, Vincent, Gong, Xianmin, Wu, Xixin, Liu, Xunying, Wong, Patrick, Meng, Helen

Early detection of neurocognitive disorders (NCDs) is crucial for timely intervention and disease management. Speech analysis offers a non-intrusive and scalable screening method, particularly through narrative tasks in neuropsychological assessment tools. Traditional narrative analysis often focuses on local indicators in microstructure, such as word usage and syntax. While these features provide insights into language production abilities, they often fail to capture global narrative patterns, or microstructures. Macrostructures include coherence, thematic organization, and logical progressions, reflecting essential cognitive skills potentially critical for recognizing NCDs. Addressing this gap, we propose to investigate specific cognitive and linguistic challenges by analyzing topical shifts, temporal dynamics, and the coherence of narratives over time, aiming to reveal cognitive deficits by identifying narrative impairments, and exploring their impact on communication and cognition. The investigation is based on the CU-MARVEL Rabbit Story corpus, which comprises recordings of a story-telling task from 758 older adults. We developed two approaches: the Dynamic Topic Models (DTM)-based temporal analysis to examine the evolution of topics over time, and the Text-Image Temporal Alignment Network (TITAN) to evaluate the coherence between spoken narratives and visual stimuli. DTM-based approach validated the effectiveness of dynamic topic consistency as a macrostructural metric (F1=0.61, AUC=0.78). The TITAN approach achieved the highest performance (F1=0.72, AUC=0.81), surpassing established microstructural and macrostructural feature sets. Cross-comparison and regression tasks further demonstrated the effectiveness of proposed dynamic macrostructural modeling approaches for NCD detection.

large language model, machine learning, natural language, (20 more...)

2501.03727

Country:

Asia > China > Hong Kong (0.05)
North America > United States (0.04)
Europe > Austria > Vienna (0.04)

Genre:

Research Report > New Finding (0.93)
Overview (0.93)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.75)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.69)
Health & Medicine > Therapeutic Area > Neurology > Dementia (0.47)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
(2 more...)

Li, Mohan, Gjoreski, Martin, Barbiero, Pietro, Slapničar, Gašper, Luštrek, Mitja, Lane, Nicholas D., Langheinrich, Marc

A Survey on Federated Learning in Human Sensing

Human Sensing, a field that leverages technology to monitor human activities, psycho-physiological states, and interactions with the environment, enhances our understanding of human behavior and drives the development of advanced services that improve overall quality of life. However, its reliance on detailed and often privacy-sensitive data as the basis for its machine learning (ML) models raises significant legal and ethical concerns. The recently proposed ML approach of Federated Learning (FL) promises to alleviate many of these concerns, as it is able to create accurate ML models without sending raw user data to a central server. While FL has demonstrated its usefulness across a variety of areas, such as text prediction and cyber security, its benefits in Human Sensing are under-explored, given the particular challenges in this domain. This survey conducts a comprehensive analysis of the current state-of-the-art studies on FL in Human Sensing, and proposes a taxonomy and an eight-dimensional assessment for FL approaches. Through the eight-dimensional assessment, we then evaluate whether the surveyed studies consider a specific FL-in-Human-Sensing challenge or not. Finally, based on the overall analysis, we discuss open challenges and highlight five research aspects related to FL in Human Sensing that require urgent research attention. Our work provides a comprehensive corpus of FL studies and aims to assist FL practitioners in developing and evaluating solutions that effectively address the real-world complexities of Human Sensing.

artificial intelligence, expert system, machine learning, (16 more...)

2501.04

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Washington > King County > Seattle (0.14)
North America > United States > New York > New York County > New York City (0.05)
(26 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Health Care Technology (1.00)
(2 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(6 more...)

TOAST Framework: A Multidimensional Approach to Ethical and Sustainable AI Integration in Organizations

Tjondronegoro, Dian

Artificial Intelligence (AI) has emerged as a transformative technology with the potential to revolutionize various sectors, from healthcare to finance, education, and beyond. However, successfully implementing AI systems remains a complex challenge, requiring a comprehensive and methodologically sound framework. This paper contributes to this challenge by introducing the Trustworthy, Optimized, Adaptable, and Socio-Technologically harmonious (TOAST) framework. It draws on insights from various disciplines to align technical strategy with ethical values, societal responsibilities, and innovation aspirations. The TOAST framework is a novel approach designed to guide the implementation of AI systems, focusing on reliability, accountability, technical advancement, adaptability, and socio-technical harmony. By grounding the TOAST framework in healthcare case studies, this paper provides a robust evaluation of its practicality and theoretical soundness in addressing operational, ethical, and regulatory challenges in high-stakes environments, demonstrating how adaptable AI systems can enhance institutional efficiency, mitigate risks like bias and data privacy, and offer a replicable model for other sectors requiring ethically aligned and efficient AI integration.

ai system, data mining, machine learning, (14 more...)

2502.00011

Country:

Europe > United Kingdom (0.14)
North America > United States > California (0.04)
Europe > Middle East > Malta > Northern Region > Western District > Attard (0.04)
(7 more...)

Genre:

Overview > Innovation (0.86)
Research Report > Experimental Study (0.67)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)
Health & Medicine > Health Care Providers & Services (0.94)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
Information Technology > Artificial Intelligence > Applied AI (1.00)
(2 more...)

TACLR: A Scalable and Efficient Retrieval-based Method for Industrial Product Attribute Value Identification

Su, Yindu, Zou, Huike, Sun, Lin, Zhang, Ting, Yang, Haiyang, Chen, Liyu, Lo, David, Zhang, Qingheng, Han, Shuguang, Chen, Jufeng

Product Attribute Value Identification (PAVI) involves identifying attribute values from product profiles, a key task for improving product search, recommendations, and business analytics on e-commerce platforms. However, existing PAVI methods face critical challenges, such as inferring implicit values, handling out-of-distribution (OOD) values, and producing normalized outputs. To address these limitations, we introduce Taxonomy-Aware Contrastive Learning Retrieval (TACLR), the first retrieval-based method for PAVI. TACLR formulates PAVI as an information retrieval task by encoding product profiles and candidate values into embeddings and retrieving values based on their similarity to the item embedding. It leverages contrastive training with taxonomy-aware hard negative sampling and employs adaptive inference with dynamic thresholds. TACLR offers three key advantages: (1) it effectively handles implicit and OOD values while producing normalized outputs; (2) it scales to thousands of categories, tens of thousands of attributes, and millions of values; and (3) it supports efficient inference for high-load industrial scenarios. Extensive experiments on proprietary and public datasets validate the effectiveness and efficiency of TACLR. Moreover, it has been successfully deployed in a real-world e-commerce platform, processing millions of product listings daily while supporting dynamic, large-scale attribute taxonomies.

computational linguistic, extraction, proceedings, (16 more...)

2501.03835

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
North America > Canada > Ontario > Toronto (0.05)
Asia > Singapore (0.04)
(7 more...)

Genre:

Research Report (0.50)
Overview (0.47)

Industry: Information Technology (0.56)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.97)
Information Technology > Data Science > Data Mining (0.89)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

A Survey on LLM-based Multi-Agent System: Recent Advances and New Frontiers in Application

Chen, Shuaihang, Liu, Yuanxing, Han, Wei, Zhang, Weinan, Liu, Ting

LLM-based Multi-Agent Systems ( LLM-MAS ) have become a research hotspot since the rise of large language models (LLMs). However, with the continuous influx of new related works, the existing reviews struggle to capture them comprehensively. This paper presents a comprehensive survey of these studies. We first discuss the definition of LLM-MAS, a framework encompassing much of previous work. We provide an overview of the various applications of LLM-MAS in (i) solving complex tasks, (ii) simulating specific scenarios, and (iii) evaluating generative agents. Building on previous studies, we also highlight several challenges and propose future directions for research in this field.

agent, llm-ma, preprint, (15 more...)

2412.17481

Country:

Asia > Thailand > Bangkok > Bangkok (0.05)
North America > Mexico > Mexico City > Mexico City (0.04)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)