AITopics | taxnodes:Technology: Overviews

Plotting

taxnodes:Technology: Overviews

News Overviews Instructional Materials AI-Alerts Classics

Long-form factuality in large language models

Neural Information Processing SystemsMar-24-2025, 18:07:30 GMT

Large language models (LLMs) often generate content that contains factual errors when responding to fact-seeking prompts on open-ended topics. To benchmark a model's long-form factuality in open domains, we first use GPT-4 to generate LongFact, a prompt set comprising thousands of questions spanning 38 topics. We then propose that LLM agents can be used as automated evaluators for longform factuality through a method which we call Search-Augmented Factuality Evaluator (SAFE). SAFE utilizes an LLM to break down a long-form response into a set of individual facts and to evaluate the accuracy of each fact using a multi-step reasoning process comprising sending search queries to Google Search and determining whether a fact is supported by the search results. Furthermore, we propose extending F1 score as an aggregated metric for long-form factuality.

large language model, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Country:

Oceania > Australia (1.00)
North America > United States > California (0.92)
Africa (0.92)
(2 more...)

Genre:

Research Report > Experimental Study (1.00)
Personal (1.00)
Overview (0.67)

Industry:

Media > Television (1.00)
Media > Music (1.00)
Media > Film (1.00)
(20 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Efficient Beam Tree Recursion

Neural Information Processing SystemsMar-24-2025, 06:48:53 GMT

Beam Tree Recursive Neural Network (BT-RvNN) was recently proposed as an extension of Gumbel Tree RvNN and it was shown to achieve state-of-the-art length generalization performance in ListOps while maintaining comparable performance on other tasks. However, although better than previous approaches in terms of memory usage, BT-RvNN can be still exorbitantly expensive. In this paper, we identify the main bottleneck in BT-RvNN's memory usage to be the entanglement of the scorer function and the recursive cell function. We propose strategies to remove this bottleneck and further simplify its memory usage. Overall, our strategies not only reduce the memory usage of BT-RvNN by 10 16 times but also create a new state-of-the-art in ListOps while maintaining similar performance in other tasks.

computational linguistic, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
Asia > Middle East (1.00)
North America > United States > Minnesota (0.28)
North America > United States > California (0.28)

Genre: Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (0.93)

Add feedback

Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models

Minjia Zhang, Wenhan Wang, Xiaodong Liu, Jianfeng Gao, Yuxiong He

Neural Information Processing SystemsMar-24-2025, 00:59:28 GMT

Neural Information Processing Systems http://nips.cc/

machine learning, natural language, small world graph, (19 more...)

Neural Information Processing Systems

Genre: Overview (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.76)

Add feedback

An Identity and Interaction Based Network Forensic Analysis

Clarke, Nathan, Alotibi, Gaseb, Joy, Dany, Li, Fudong, Furnell, Steven, Alshumrani, Ali, Mohammed, Hussan

arXiv.org Artificial IntelligenceMar-24-2025

In todays landscape of increasing electronic crime, network forensics plays a pivotal role in digital investigations. It aids in understanding which systems to analyse and as a supplement to support evidence found through more traditional computer based investigations. However, the nature and functionality of the existing Network Forensic Analysis Tools (NFATs) fall short compared to File System Forensic Analysis Tools (FS FATs) in providing usable data. The analysis tends to focus upon IP addresses, which are not synonymous with user identities, a point of significant interest to investigators. This paper presents several experiments designed to create a novel NFAT approach that can identify users and understand how they are using network based applications whilst the traffic remains encrypted. The experiments build upon the prior art and investigate how effective this approach is in classifying users and their actions. Utilising an in-house dataset composed of 50 million packers, the experiments are formed of three incremental developments that assist in improving performance. Building upon the successful experiments, a proposed NFAT interface is presented to illustrate the ease at which investigators would be able to ask relevant questions of user interactions. The experiments profiled across 27 users, has yielded an average 93.3% True Positive Identification Rate (TPIR), with 41% of users experiencing 100% TPIR. Skype, Wikipedia and Hotmail services achieved a notably high level of recognition performance. The study has developed and evaluated an approach to analyse encrypted network traffic more effectively through the modelling of network traffic and to subsequently visualise these interactions through a novel network forensic analysis tool.

artificial intelligence, machine learning, survey article, (18 more...)

arXiv.org Artificial Intelligence

2503.18542

Country: Europe > United Kingdom > England (0.15)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

From Fragment to One Piece: A Survey on AI-Driven Graphic Design

Zou, Xingxing, Zhang, Wen, Zhao, Nanxuan

arXiv.org Artificial IntelligenceMar-24-2025

This survey provides a comprehensive overview of the advancements in Artificial Intelligence in Graphic Design (AIGD), focusing on integrating AI techniques to support design interpretation and enhance the creative process. We categorize the field into two primary directions: perception tasks, which involve understanding and analyzing design elements, and generation tasks, which focus on creating new design elements and layouts. The survey covers various subtasks, including visual element perception and generation, aesthetic and semantic understanding, layout analysis, and generation. We highlight the role of large language models and multimodal approaches in bridging the gap between localized visual features and global design intent. Despite significant progress, challenges remain to understanding human intent, ensuring interpretability, and maintaining control over multilayered compositions. This survey serves as a guide for researchers, providing information on the current state of AIGD and potential future directions\footnote{https://github.com/zhangtianer521/excellent\_Intelligent\_graphic\_design}.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2503.18641

Country:

North America > United States (0.14)
Asia > China (0.14)

Genre: Overview (1.00)

Industry: Media (0.46)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

Natural Language Processing for Electronic Health Records in Scandinavian Languages: Norwegian, Swedish, and Danish

Woldaregay, Ashenafi Zebene, Lund, Jørgen Aarmo, Ngo, Phuong Dinh, Tayefi, Mariyam, Burman, Joel, Hansen, Stine, Sillesen, Martin Hylleholt, Dalianis, Hercules, Jenssen, Robert, Ole, Lindsetmo Rolf, Mikalsen, Karl Øyvind

arXiv.org Artificial IntelligenceMar-24-2025

Background: Clinical natural language processing (NLP) refers to the use of computational methods for extracting, processing, and analyzing unstructured clinical text data, and holds a huge potential to transform healthcare in various clinical tasks. Objective: The study aims to perform a systematic review to comprehensively assess and analyze the state-of-the-art NLP methods for the mainland Scandinavian clinical text. Method: A literature search was conducted in various online databases including PubMed, ScienceDirect, Google Scholar, ACM digital library, and IEEE Xplore between December 2022 and February 2024. Further, relevant references to the included articles were also used to solidify our search. The final pool includes articles that conducted clinical NLP in the mainland Scandinavian languages and were published in English between 2010 and 2024. Results: Out of the 113 articles, 18% (n=21) focus on Norwegian clinical text, 64% (n=72) on Swedish, 10% (n=11) on Danish, and 8% (n=9) focus on more than one language. Generally, the review identified positive developments across the region despite some observable gaps and disparities between the languages. There are substantial disparities in the level of adoption of transformer-based models. In essential tasks such as de-identification, there is significantly less research activity focusing on Norwegian and Danish compared to Swedish text. Further, the review identified a low level of sharing resources such as data, experimentation code, pre-trained models, and rate of adaptation and transfer learning in the region. Conclusion: The review presented a comprehensive assessment of the state-of-the-art Clinical NLP for electronic health records (EHR) text in mainland Scandinavian languages and, highlighted the potential barriers and challenges that hinder the rapid advancement of the field in the region.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2503.18539

Country:

Asia (1.00)
Europe > Sweden > Vaestra Goetaland > Gothenburg (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.67)
Research Report > New Finding (0.67)

Industry: Health & Medicine > Health Care Technology > Medical Record (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

Add feedback

The Role of Artificial Intelligence in Enhancing Insulin Recommendations and Therapy Outcomes

Panagiotou, Maria, Stroemmen, Knut, Brigato, Lorenzo, de Galan, Bastiaan E., Mougiakakou, Stavroula

arXiv.org Artificial IntelligenceMar-24-2025

The growing worldwide incidence of diabetes requires more effective approaches for managing blood glucose levels. Insulin delivery systems have advanced significantly, with artificial intelligence (AI) playing a key role in improving their precision and adaptability. AI algorithms, particularly those based on reinforcement learning, allow for personalised insulin dosing by continuously adapting to an individual's responses. Despite these advancements, challenges such as data privacy, algorithm transparency, and accessibility still need to be addressed. Continued progress and validation in AI-driven insulin delivery systems promise to improve therapy outcomes further, offering people more effective and individualised management of their diabetes. This paper presents an overview of current strategies, key challenges, and future directions.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

2503.18592

Country:

Europe (1.00)
North America > United States (0.68)

Genre:

Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

SciClaims: An End-to-End Generative System for Biomedical Claim Analysis

Ortega, Raúl, Gómez-Pérez, José Manuel

arXiv.org Artificial IntelligenceMar-24-2025

Validating key claims in scientific literature, particularly in biomedical research, is essential for ensuring accuracy and advancing knowledge. This process is critical in sectors like the pharmaceutical industry, where rapid scientific progress requires automation and deep domain expertise. However, current solutions have significant limitations. They lack end-to-end pipelines encompassing all claim extraction, evidence retrieval, and verification steps; rely on complex NLP and information retrieval pipelines prone to multiple failure points; and often fail to provide clear, user-friendly justifications for claim verification outcomes. To address these challenges, we introduce SciClaims, an advanced system powered by state-of-the-art large language models (LLMs) that seamlessly integrates the entire scientific claim analysis process. SciClaims outperforms previous approaches in both claim extraction and verification without requiring additional fine-tuning, setting a new benchmark for automated scientific claim analysis.

artificial intelligence, large language model, natural language, (19 more...)

arXiv.org Artificial Intelligence

2503.18526

Country:

Europe > Spain (0.14)
Europe > France (0.14)
North America > United States > Louisiana (0.14)
Asia > Thailand (0.14)

Genre:

Research Report (0.64)
Overview (0.46)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.90)

Add feedback

Near-optimal Active Reconstruction

Yang, Daniel

arXiv.org Artificial IntelligenceMar-24-2025

With the growing practical interest in vision-based tasks for autonomous systems, the need for efficient and complex methods becomes increasingly larger. In the rush to develop new methods with the aim to outperform the current state of the art, an analysis of the underlying theory is often neglected and simply replaced with empirical evaluations in simulated or real-world experiments. While such methods might yield favorable performance in practice, they are often less well understood, which prevents them from being applied in safety-critical systems. The goal of this work is to design an algorithm for the Next Best View (NBV) problem in the context of active object reconstruction, for which we can provide qualitative performance guarantees with respect to true optimality. To the best of our knowledge, no previous work in this field addresses such an analysis for their proposed methods. Based on existing work on Gaussian process optimization, we rigorously derive sublinear bounds for the cumulative regret of our algorithm, which guarantees near-optimality. Complementing this, we evaluate the performance of our algorithm empirically within our simulation framework. We further provide additional insights through an extensive study of potential objective functions and analyze the differences to the results of related work.

data mining, machine learning, objective function, (22 more...)

arXiv.org Artificial Intelligence

2503.18999

Country: North America > United States (0.27)

Genre:

Overview (1.00)
Research Report > New Finding (0.45)

Industry: Energy (0.45)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(2 more...)

Add feedback

A Survey of Large Language Model Agents for Question Answering

Yue, Murong

arXiv.org Artificial IntelligenceMar-24-2025

This paper surveys the development of large language model (LLM)-based agents for question answering (QA). Traditional agents face significant limitations, including substantial data requirements and difficulty in generalizing to new environments. LLM-based agents address these challenges by leveraging LLMs as their core reasoning engine. These agents achieve superior QA results compared to traditional QA pipelines and naive LLM QA systems by enabling interaction with external environments. We systematically review the design of LLM agents in the context of QA tasks, organizing our discussion across key stages: planning, question understanding, information retrieval, and answer generation. Additionally, this paper identifies ongoing challenges and explores future research directions to enhance the performance of LLM agent QA systems.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2503.19213

Country: North America (0.28)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Education (0.67)
Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback