AITopics | Bangor

Large language models (LLMs) exhibit remarkable generative capabilities but often suffer from hallucinations. Retrieval-augmented generation (RAG) offers an effective solution by incorporating external knowledge, but existing methods still face several limitations: additional deployment costs of separate retrievers, redundant input tokens from retrieved text chunks, and the lack of joint optimization of retrieval and generation. To address these issues, we propose \textbf{RetroLLM}, a unified framework that integrates retrieval and generation into a single, cohesive process, enabling LLMs to directly generate fine-grained evidence from the corpus with constrained decoding. Moreover, to mitigate false pruning in the process of constrained evidence generation, we introduce (1) hierarchical FM-Index constraints, which generate corpus-constrained clues to identify a subset of relevant documents before evidence generation, reducing irrelevant decoding space; and (2) a forward-looking constrained decoding strategy, which considers the relevance of future sequences to improve evidence accuracy. Extensive experiments on five open-domain QA datasets demonstrate RetroLLM's superior performance across both in-domain and out-of-domain tasks. The code is available at \url{https://github.com/sunnynexus/RetroLLM}.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2412.11919

Country:

North America > United States > District of Columbia > Washington (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > California > San Francisco County > San Francisco (0.04)
(24 more...)

Genre:

Personal > Honors (1.00)
Research Report > New Finding (0.67)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Media > Music (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models

Jin, Zhuoran, Cao, Pengfei, Wang, Chenhao, He, Zhitao, Yuan, Hongbang, Li, Jiachun, Chen, Yubo, Liu, Kang, Zhao, Jun

arXiv.org Artificial IntelligenceJun-16-2024

Large language models (LLMs) inevitably memorize sensitive, copyrighted, and harmful knowledge from the training corpus; therefore, it is crucial to erase this knowledge from the models. Machine unlearning is a promising solution for efficiently removing specific knowledge by post hoc modifying models. In this paper, we propose a Real-World Knowledge Unlearning benchmark (RWKU) for LLM unlearning. RWKU is designed based on the following three key factors: (1) For the task setting, we consider a more practical and challenging unlearning setting, where neither the forget corpus nor the retain corpus is accessible. (2) For the knowledge source, we choose 200 real-world famous people as the unlearning targets and show that such popular knowledge is widely present in various LLMs. (3) For the evaluation framework, we design the forget set and the retain set to evaluate the model's capabilities across various real-world applications. Regarding the forget set, we provide four four membership inference attack (MIA) methods and nine kinds of adversarial attack probes to rigorously test unlearning efficacy. Regarding the retain set, we assess locality and utility in terms of neighbor perturbation, general ability, reasoning ability, truthfulness, factuality, and fluency. We conduct extensive experiments across two unlearning scenarios, two models and six baseline methods and obtain some meaningful findings. We release our benchmark and code publicly at http://rwku-bench.github.io for future work.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2406.1089

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Indiana (0.04)
(21 more...)

Genre:

Personal (1.00)
Research Report > Promising Solution (0.34)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Law (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Measuring Vision-Language STEM Skills of Neural Models

Shen, Jianhao, Yuan, Ye, Mirzoyan, Srbuhi, Zhang, Ming, Wang, Chenguang

arXiv.org Artificial IntelligenceMay-22-2024

We introduce a new challenge to test the STEM skills of neural models. The problems in the real world often require solutions, combining knowledge from STEM (science, technology, engineering, and math). Unlike existing datasets, our dataset requires the understanding of multimodal vision-language information of STEM. Our dataset features one of the largest and most comprehensive datasets for the challenge. It includes 448 skills and 1,073,146 questions spanning all STEM subjects. Compared to existing datasets that often focus on examining expert-level ability, our dataset includes fundamental skills and questions designed based on the K-12 curriculum. We also add state-of-the-art foundation models such as CLIP and GPT-3.5-Turbo to our benchmark. Results show that the recent model advances only help master a very limited number of lower grade-level skills (2.5% in the third grade) in our dataset. In fact, these models are still well below (averaging 54.7%) the performance of elementary students, not to mention near expert-level performance. To understand and increase the performance on our dataset, we teach the models on a training split of our dataset. Even though we observe improved performance, the model performance remains relatively low compared to average elementary students. To solve STEM problems, we will need novel algorithmic innovations from the community.

answer index, conference paper, dataset, (16 more...)

arXiv.org Artificial Intelligence

2402.17205

Country:

North America > United States > California (0.04)
Asia > China (0.04)
South America > Ecuador (0.04)
(7 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Energy (1.00)
Materials (0.92)
Leisure & Entertainment > Games > Computer Games (0.92)
Education > Educational Setting > K-12 Education > Primary School (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

Enhancing Retrieval-Augmented Large Language Models with Iterative Retrieval-Generation Synergy

Shao, Zhihong, Gong, Yeyun, Shen, Yelong, Huang, Minlie, Duan, Nan, Chen, Weizhu

arXiv.org Artificial IntelligenceOct-23-2023

Large language models are powerful text processors and reasoners, but are still subject to limitations including outdated knowledge and hallucinations, which necessitates connecting them to the world. Retrieval-augmented large language models have raised extensive attention for grounding model generation on external knowledge. However, retrievers struggle to capture relevance, especially for queries with complex information needs. Recent work has proposed to improve relevance modeling by having large language models actively involved in retrieval, i.e., to improve retrieval with generation. In this paper, we show that strong performance can be achieved by a method we call Iter-RetGen, which synergizes retrieval and generation in an iterative manner. A model output shows what might be needed to finish a task, and thus provides an informative context for retrieving more relevant knowledge which in turn helps generate a better output in the next iteration. Compared with recent work which interleaves retrieval with generation when producing an output, Iter-RetGen processes all retrieved knowledge as a whole and largely preserves the flexibility in generation without structural constraints. We evaluate Iter-RetGen on multi-hop question answering, fact verification, and commonsense reasoning, and show that it can flexibly leverage parametric knowledge and non-parametric knowledge, and is superior to or competitive with state-of-the-art retrieval-augmented baselines while causing fewer overheads of retrieval and generation. We can further improve performance via generation-augmented retrieval adaptation.

final answer, intermediate answer, knowledge, (12 more...)

arXiv.org Artificial Intelligence

2305.15294

Country:

Europe > Serbia > Central Serbia > Belgrade (0.06)
Asia > China > Beijing > Beijing (0.05)
Asia > Vietnam (0.05)
(15 more...)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Leisure & Entertainment > Sports (0.94)
Media (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Safe Control Synthesis with Uncertain Dynamics and Constraints

Long, Kehan, Dhiman, Vikas, Leok, Melvin, Cortés, Jorge, Atanasov, Nikolay

arXiv.org Artificial IntelligenceSep-30-2022

This paper considers safe control synthesis for dynamical systems with either probabilistic or worst-case uncertainty in both the dynamics model and the safety constraints. We formulate novel probabilistic and robust (worst-case) control Lyapunov function (CLF) and control barrier function (CBF) constraints that take into account the effect of uncertainty in either case. We show that either the probabilistic or the robust (worst-case) formulation leads to a second-order cone program (SOCP), which enables efficient safe and stable control synthesis. We evaluate our approach in PyBullet simulations of an autonomous robot navigating in unknown environments and compare the performance with a baseline CLF-CBF quadratic programming approach.

artificial intelligence, constraint, machine learning, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/LRA.2022.3182544

2202.09557

Country:

North America > United States > Maine > Penobscot County > Bangor (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)

Add feedback

Deep learning model trained on mobile phone-acquired frozen section images effectively detects basal cell carcinoma

Cao, Junli, S., B., Wu, Junyan, S., M., Zhang, Jing W., D., M., D., Ph., Ye, Jay J., D., M., D., Ph., Yu, Limin, D., M., S, M.

arXiv.org Artificial IntelligenceNov-22-2020

Background: Margin assessment of basal cell carcinoma using the frozen section is a common task of pathology intraoperative consultation. Although frequently straight-forward, the determination of the presence or absence of basal cell carcinoma on the tissue sections can sometimes be challenging. We explore if a deep learning model trained on mobile phone-acquired frozen section images can have adequate performance for future deployment. Materials and Methods: One thousand two hundred and forty-one (1241) images of frozen sections performed for basal cell carcinoma margin status were acquired using mobile phones. The photos were taken at 100x magnification (10x objective). The images were downscaled from a 4032 x 3024 pixel resolution to 576 x 432 pixel resolution. Semantic segmentation algorithm Deeplab V3 with Xception backbone was used for model training. Results: The model uses an image as input and produces a 2-dimensional black and white output of prediction of the same dimension; the areas determined to be basal cell carcinoma were displayed with white color, in a black background. Any output with the number of white pixels exceeding 0.5% of the total number of pixels is deemed positive for basal cell carcinoma. On the test set, the model achieves area under curve of 0.99 for receiver operator curve and 0.97 for precision-recall curve at the pixel level. The accuracy of classification at the slide level is 96%. Conclusions: The deep learning model trained with mobile phone images shows satisfactory performance characteristics, and thus demonstrates the potential for deploying as a mobile phone app to assist in frozen section interpretation in real time.

basal cell carcinoma, cell carcinoma, learning model, (12 more...)

arXiv.org Artificial Intelligence

2011.11081

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > South Carolina > Richland County > Columbia (0.04)
North America > United States > Michigan > Oakland County > Royal Oak (0.04)
North America > United States > Maine > Penobscot County > Bangor (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback