Kunde, Jackson
VersaPRM: Multi-Domain Process Reward Model via Synthetic Reasoning Data
Zeng, Thomas, Zhang, Shuibai, Wu, Shutong, Classen, Christian, Chae, Daewon, Ewer, Ethan, Lee, Minjae, Kim, Heeju, Kang, Wonjun, Kunde, Jackson, Fan, Ying, Kim, Jungtaek, Koo, Hyung Il, Ramchandran, Kannan, Papailiopoulos, Dimitris, Lee, Kangwook
Process Reward Models (PRMs) have proven effective at enhancing mathematical reasoning for Large Language Models (LLMs) by leveraging increased inference-time computation. However, they are predominantly trained on mathematical data, and their generalizability to non-mathematical domains has not been rigorously studied. In response, this work first shows that current PRMs have poor performance in other domains. To address this limitation, we introduce VersaPRM, a multi-domain PRM trained on synthetic reasoning data generated using our novel data generation and annotation method.

In particular, Outcome Reward Models (ORMs) are used to provide supervision based solely on the correctness of the final outcome. However, ORMs fail to address errors in intermediate steps, limiting their effectiveness for complex, multi-step reasoning tasks (Luo et al., 2024; Lightman et al., 2024; Sun et al., 2024). Because ORMs suffer from this limitation, Process Reward Models (PRMs) have been proposed to offer fine-grained, step-by-step feedback on the correctness of each reasoning step (Lightman et al., 2024; Uesato et al., 2022). PRMs have proven highly effective during inference, improving the reranking of generated solutions and guiding LLMs through search-based algorithms (Wan et al., 2024; Wang et al., 2024a).
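To make the reranking use case concrete, below is a minimal Python sketch of PRM-guided best-of-N selection. The `prm_score_step` callable and the min-aggregation rule are illustrative assumptions, not the interface or aggregation used by VersaPRM; the point is only that a PRM scores each intermediate step and those step scores decide which sampled solution to keep.

```python
# Minimal sketch of PRM-guided best-of-N reranking.
# `prm_score_step` is a hypothetical stand-in for any process reward model
# that returns a correctness score in [0, 1] for one reasoning step,
# given the question and the steps that precede it.

from typing import Callable, List, Optional


def aggregate_step_scores(step_scores: List[float]) -> float:
    """Collapse per-step scores into one solution-level score.

    Taking the minimum is one common choice: a reasoning chain is only as
    trustworthy as its weakest step. Product or mean are alternatives.
    """
    return min(step_scores)


def rerank_with_prm(
    question: str,
    candidate_solutions: List[List[str]],  # each candidate = list of reasoning steps
    prm_score_step: Callable[[str, List[str], str], float],
) -> Optional[List[str]]:
    """Return the candidate whose aggregated PRM score is highest."""
    best_solution, best_score = None, float("-inf")
    for steps in candidate_solutions:
        if not steps:
            continue  # skip degenerate candidates with no steps
        step_scores = [
            prm_score_step(question, steps[:i], step)
            for i, step in enumerate(steps)
        ]
        score = aggregate_step_scores(step_scores)
        if score > best_score:
            best_solution, best_score = steps, score
    return best_solution
```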
Multi-Bin Batching for Increasing LLM Inference Throughput
Guldogan, Ozgur, Kunde, Jackson, Lee, Kangwook, Pedarsani, Ramtin
Large Language Model (LLM) inference systems are becoming increasingly popular due to their diverse capabilities, such as text generation (Li et al., 2024), coding assistance (Chen et al., 2021), and question answering (Jiang et al., 2021). As demand for LLM inference systems grows, so does the need to optimize their efficiency. Several techniques have been proposed to improve the efficiency of LLM inference, and batched inference (Sheng et al., 2023; Kwon et al., 2023; Jin et al., 2023) is among the most promising. With batched inference, multiple requests are processed simultaneously, exploiting the underlying hardware's parallelism to improve throughput. Figure 1(a) shows the measured throughput of the Phi-3.5 Mini Instruct model (Abdin et al., 2024) for various batch sizes on an NVIDIA A100 80GB GPU, where throughput is calculated as the total number of tokens generated across all requests divided by time.

However, batched inference comes with critical drawbacks. The execution time of each request depends on the number of tokens it generates, which varies across requests. In standard batched inference systems, a computing unit remains locked until all requests in the batch are completed, leading to resource underutilization when requests within a batch have widely differing execution times.
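As a rough illustration of the throughput accounting and the batch-locking drawback described above, the sketch below models a batch whose completion time is set by its longest request. The per-token latency and the output lengths are made-up numbers for illustration, not measurements from Figure 1(a) or from the paper.

```python
# Minimal sketch of why fixed batches underutilize hardware when output
# lengths differ. The linear per-token cost model and all numbers below
# are illustrative assumptions, not measured values.

from typing import List


def batch_completion_time(output_lengths: List[int], time_per_token: float = 0.02) -> float:
    """A batch finishes only when its longest request finishes."""
    return max(output_lengths) * time_per_token


def batch_throughput(output_lengths: List[int], time_per_token: float = 0.02) -> float:
    """Throughput = total tokens generated across all requests / wall-clock time."""
    total_tokens = sum(output_lengths)
    return total_tokens / batch_completion_time(output_lengths, time_per_token)


# A batch with similar output lengths keeps every slot busy until the end...
balanced = [400, 410, 390, 405]
# ...while one long request forces the short ones to hold their slots idle.
skewed = [50, 60, 70, 1000]

print(f"balanced batch throughput: {batch_throughput(balanced):.1f} tokens/s")
print(f"skewed batch throughput:   {batch_throughput(skewed):.1f} tokens/s")
```

Under this toy model, the balanced batch generates roughly 196 tokens/s while the skewed batch drops to about 59 tokens/s, since three of its four slots sit idle for most of the batch's lifetime; grouping requests with similar expected lengths is the intuition behind multi-bin batching.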