AITopics

2402.14760

Country:

North America > United States (0.14)
Pacific Ocean (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

UniTST: Effectively Modeling Inter-Series and Intra-Series Dependencies for Multivariate Time Series Forecasting

Liu, Juncheng, Liu, Chenghao, Woo, Gerald, Wang, Yiwei, Hooi, Bryan, Xiong, Caiming, Sahoo, Doyen

Transformer-based models have emerged as powerful tools for multivariate time series forecasting (MTSF). However, existing Transformer models often fall short of capturing both intricate dependencies across variate and temporal dimensions in MTS data. Some recent models are proposed to separately capture variate and temporal dependencies through either two sequential or parallel attention mechanisms. However, these methods cannot directly and explicitly learn the intricate inter-series and intra-series dependencies. In this work, we first demonstrate that these dependencies are very important as they usually exist in real-world data. To directly model these dependencies, we propose a transformer-based model UniTST containing a unified attention mechanism on the flattened patch tokens. Additionally, we add a dispatcher module which reduces the complexity and makes the model feasible for a potentially large number of variates. Although our proposed model employs a simple architecture, it offers compelling performance as shown in our extensive experiments on several datasets for time series forecasting.
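The unified attention over flattened patch tokens described above can be illustrated with a small sketch. This is not the authors' implementation: the function names, the patch length, the toy data, and the use of identity query/key/value projections are all assumptions made for illustration, and the dispatcher module is omitted.

```python
import math

def make_patches(series, patch_len):
    """Split one variate's series into non-overlapping patches (tokens)."""
    usable = len(series) - len(series) % patch_len
    return [series[i:i + patch_len] for i in range(0, usable, patch_len)]

def flatten_patch_tokens(mts, patch_len):
    """Concatenate the patches of every variate into one token sequence."""
    tokens = []
    for variate in mts:
        tokens.extend(make_patches(variate, patch_len))
    return tokens

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def unified_attention(tokens):
    """Single-head self-attention over all tokens.

    Assumption: identity projections stand in for learned Q/K/V weights.
    Every token attends to every other token, so a patch of one variate
    can attend to a patch of another variate at a different time.
    """
    dim = len(tokens[0])
    mixed = []
    for query in tokens:
        scores = [
            sum(a * b for a, b in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        mixed.append(
            [sum(w * value[j] for w, value in zip(weights, tokens))
             for j in range(dim)]
        )
    return mixed

# Hypothetical toy input: 3 variates, 8 time steps each.
mts = [
    [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7],
    [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3],
    [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
]
tokens = flatten_patch_tokens(mts, patch_len=4)  # 3 variates x 2 patches
mixed = unified_attention(tokens)
```

Because the attention matrix here has one row and one column per patch token across all variates, its size grows quadratically with the number of variates, which is the cost the abstract says the dispatcher module is meant to reduce.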

dependency, forecasting, variate, (16 more...)

2406.04975

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)
Asia > Singapore (0.04)

Genre: Research Report (0.82)

Industry:

Energy (0.69)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Li, Harry, Appleby, Gabriel, Suh, Ashley

LinkQ: An LLM-Assisted Visual Interface for Knowledge Graph Question-Answering

We present LinkQ, a system that leverages a large language model (LLM) to facilitate knowledge graph (KG) query construction through natural language question-answering. Traditional approaches often require detailed knowledge of complex graph querying languages, limiting the ability for users -- even experts -- to acquire valuable insights from KG data. LinkQ simplifies this process by first interpreting a user's question, then converting it into a well-formed KG query. By using the LLM to construct a query instead of directly answering the user's question, LinkQ guards against the LLM hallucinating or generating false, erroneous information. By integrating an LLM into LinkQ, users are able to conduct both exploratory and confirmatory data analysis, with the LLM helping to iteratively refine open-ended questions into precise ones. To demonstrate the efficacy of LinkQ, we conducted a qualitative study with five KG practitioners and distill their feedback. Our results indicate that practitioners find LinkQ effective for KG question-answering, and desire future LLM-assisted systems for the exploratory analysis of graph databases.

linkq, llm, query, (13 more...)

2406.06621

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Pacific Ocean (0.04)
(2 more...)

Genre: Research Report (0.84)

Industry: Leisure & Entertainment > Sports (0.30)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.32)

Are We Done with MMLU?

Gema, Aryo Pradipta, Leang, Joshua Ong Jun, Hong, Giwon, Devoto, Alessio, Mancino, Alberto Carlo Maria, Saxena, Rohit, He, Xuanli, Zhao, Yu, Du, Xiaotang, Madani, Mohammad Reza Ghasemi, Barale, Claire, McHardy, Robert, Harris, Joshua, Kaddour, Jean, van Krieken, Emile, Minervini, Pasquale

We identify and analyse errors in the popular Massive Multitask Language Understanding (MMLU) benchmark. Even though MMLU is widely adopted, our analysis demonstrates numerous ground truth errors that obscure the true capabilities of LLMs. For example, we find that 57% of the analysed questions in the Virology subset contain errors. To address this issue, we introduce a comprehensive framework for identifying dataset errors using a novel error taxonomy. Then, we create MMLU-Redux, which is a subset of 3,000 manually re-annotated questions across 30 MMLU subjects. Using MMLU-Redux, we demonstrate significant discrepancies with the model performance metrics that were originally reported. Our results strongly advocate for revising MMLU's error-ridden questions to enhance its future utility and reliability as a benchmark.

dataset, mmlu, mmlu-redux, (13 more...)

2406.04127

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Spain > Galicia > Madrid (0.04)
Africa > West Africa (0.04)
(10 more...)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Therapeutic Area > Immunology (0.68)
Education > Curriculum > Subject-Specific Education (0.48)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Can LLMs Learn from Previous Mistakes? Investigating LLMs' Errors to Boost for Reasoning

Tong, Yongqi, Li, Dawei, Wang, Sizhe, Wang, Yujia, Teng, Fei, Shang, Jingbo

Recent works have shown the benefits to LLMs from fine-tuning golden-standard Chain-of-Thought (CoT) rationales or using them as correct examples in few-shot prompting. While humans can indeed imitate correct examples, learning from our mistakes is another vital aspect of human cognition. Hence, a question naturally arises: can LLMs learn and benefit from their mistakes, especially for their reasoning? This study investigates this problem from both the prompting and model-tuning perspectives. We begin by introducing CoTErrorSet, a new benchmark with 609,432 questions, each designed with both correct and error references, and demonstrating the types and reasons for making such mistakes. To explore the effectiveness of those mistakes, we design two methods: (1) Self-rethinking prompting guides LLMs to rethink whether they have made similar previous mistakes; and (2) Mistake tuning involves finetuning models in both correct and incorrect reasoning domains, rather than only tuning models to learn ground truth in traditional methodology. We conduct a series of experiments to prove LLMs can obtain benefits from mistakes in both directions. Our two methods offer potentially cost-effective strategies by leveraging errors to enhance reasoning capabilities, which costs significantly less than creating meticulously hand-crafted golden references. We ultimately make a thorough analysis of the reasons behind LLMs' errors, which provides directions that future research needs to overcome. CoTErrorSet will be published soon on https://github.com/YookiTong/Learn-from-Mistakes-CotErrorSet.

preprint arxiv, rationale, reasoning, (13 more...)

2403.20046

Country:

Asia > Myanmar > Tanintharyi Region > Dawei (0.05)
Pacific Ocean (0.04)
Europe (0.04)
(4 more...)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Artificial Intelligence, Jun-6-2024

A + B: A General Generator-Reader Framework for Optimizing LLMs to Unleash Synergy Potential

Tang, Wei, Cao, Yixin, Ying, Jiahao, Wang, Bo, Zhao, Yuyue, Liao, Yong, Zhou, Pengyuan

Retrieval-Augmented Generation (RAG) is an effective solution to supplement necessary knowledge to large language models (LLMs). Targeting its bottleneck of retriever performance, the "generate-then-read" pipeline has been proposed to replace the retrieval stage with generation from the LLM itself. Although promising, this research direction is underexplored and still cannot work in the scenario when source knowledge is given. In this paper, we formalize a general "A + B" framework with varying combinations of foundation models and types for systematic investigation. We explore the efficacy of the base and chat versions of LLMs and find their different functionalities suitable for generator A and reader B, respectively. Their combinations consistently outperform single models, especially in complex scenarios. Furthermore, we extend the application of the "A + B" framework to scenarios involving source documents through continuous learning, enabling the direct integration of external knowledge into LLMs. This approach not only facilitates effective acquisition of new knowledge but also addresses the challenges of safety and helpfulness post-adaptation. The paper underscores the versatility of the "A + B" framework, demonstrating its potential to enhance the practical application of LLMs across various domains.

knowledge, llm, scenario, (15 more...)

2406.03963

Country:

Asia > China > Beijing > Beijing (0.04)
Pacific Ocean (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Leisure & Entertainment (1.00)
Media > Music (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Jun-6-2024

NATURAL PLAN: Benchmarking LLMs on Natural Language Planning

Zheng, Huaixiu Steven, Mishra, Swaroop, Zhang, Hugh, Chen, Xinyun, Chen, Minmin, Nova, Azade, Hou, Le, Cheng, Heng-Tze, Le, Quoc V., Chi, Ed H., Zhou, Denny

We introduce NATURAL PLAN, a realistic planning benchmark in natural language containing 3 key tasks: Trip Planning, Meeting Planning, and Calendar Scheduling. We focus our evaluation on the planning capabilities of LLMs with full information on the task, by providing outputs from tools such as Google Flights, Google Maps, and Google Calendar as contexts to the models. This eliminates the need for a tool-use environment for evaluating LLMs on Planning. We observe that NATURAL PLAN is a challenging benchmark for state-of-the-art models. For example, in Trip Planning, GPT-4 and Gemini 1.5 Pro could only achieve 31.1% and 34.8% solve rates, respectively. We find that model performance drops drastically as the complexity of the problem increases: all models perform below 5% when there are 10 cities, highlighting a significant gap in planning in natural language for SoTA LLMs. We also conduct extensive ablation studies on NATURAL PLAN to further shed light on the (in)effectiveness of approaches such as self-correction, few-shot generalization, and in-context planning with long contexts on improving LLM planning.

barcelona, florence, trip planning, (12 more...)

2406.04520

Country:

Europe > Finland > Uusimaa > Helsinki (0.08)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.05)
Pacific Ocean > North Pacific Ocean > San Francisco Bay > Golden Gate (0.05)
(3 more...)

Genre: Research Report (0.84)

Industry: Consumer Products & Services > Travel (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Jun-6-2024

Evaluating Quantized Large Language Models

Li, Shiyao, Ning, Xuefei, Wang, Luning, Liu, Tengxuan, Shi, Xiangsheng, Yan, Shengen, Dai, Guohao, Yang, Huazhong, Wang, Yu

Post-training quantization (PTQ) has emerged as a promising technique to reduce the cost of large language models (LLMs). Specifically, PTQ can effectively mitigate memory consumption and reduce computational overhead in LLMs. To meet the requirements of both high efficiency and performance across diverse scenarios, a comprehensive evaluation of quantized LLMs is essential to guide the selection of quantization methods. This paper presents a thorough evaluation of these factors by evaluating the effect of PTQ on Weight, Activation, and KV Cache on 11 model families, including OPT, LLaMA2, Falcon, Bloomz, Mistral, ChatGLM, Vicuna, LongChat, StableLM, Gemma, and Mamba, with parameters ranging from 125M to 180B. The evaluation encompasses five types of tasks: basic NLP, emergent ability, trustworthiness, dialogue, and long-context tasks. Moreover, we also evaluate the state-of-the-art (SOTA) quantization methods to demonstrate their applicability. Based on the extensive experiments, we systematically summarize the effect of quantization, provide recommendations to apply quantization techniques, and point out future directions. The code can be found in https://github.com/thu-nics/qllm-eval.
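To make the idea of post-training weight quantization concrete, here is a minimal sketch of symmetric round-to-nearest quantization to signed integers. It is a generic baseline under stated assumptions, not one of the specific methods evaluated in the paper or code from the linked repository; the function names and the example weights are hypothetical.

```python
def quantize_symmetric(weights, num_bits=8):
    """Round-to-nearest symmetric quantization with one scale per tensor.

    Assumption: a single per-tensor scale; practical PTQ methods often use
    per-channel or per-group scales and calibration data.
    """
    qmax = 2 ** (num_bits - 1) - 1          # 127 for 8 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Map each weight to the nearest integer level, clamped to [-qmax, qmax].
    quantized = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate floating-point weights from integer levels."""
    return [q * scale for q in quantized]

# Hypothetical weight values for illustration only.
weights = [0.42, -1.27, 0.05, 0.88, -0.33]
q, scale = quantize_symmetric(weights)
restored = dequantize(q, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))
```

With round-to-nearest, the reconstruction error of any unclamped weight is at most half of `scale`, so lowering `num_bits` trades memory for a coarser grid. The same scheme can in principle be applied to activations or KV cache entries, the other two targets named in the abstract.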

dataset, quant, quantization, (12 more...)

2402.18158

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Asia > China > Shanghai > Shanghai (0.04)
(4 more...)

Genre:

Research Report > New Finding (0.67)
Research Report > Promising Solution (0.47)

Industry:

Transportation > Passenger (1.00)
Transportation > Air (1.00)
Consumer Products & Services > Travel (1.00)
Education > Educational Setting > K-12 Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Calderon, Josuan, Berman, Gordon J.

Inferring the time-varying coupling of dynamical systems with temporal convolutional autoencoders

arXiv.org Artificial Intelligence, Jun-5-2024

Most approaches for assessing causality in complex dynamical systems fail when the interactions between variables are inherently non-linear and non-stationary. Here we introduce Temporal Autoencoders for Causal Inference (TACI), a methodology that combines a new surrogate data metric for assessing causal interactions with a novel two-headed machine learning architecture to identify and measure the direction and strength of time-varying causal interactions. Through tests on both synthetic and real-world datasets, we demonstrate TACI's ability to accurately quantify dynamic causal interactions across a variety of systems. Our findings display the method's effectiveness compared to existing approaches and also highlight our approach's potential to build a deeper understanding of the mechanisms that underlie time-varying interactions in physical and biological systems.

artificial intelligence, interaction, machine learning, (17 more...)

2406.03212

Country:

North America > United States > Georgia > Fulton County > Atlanta (0.04)
Pacific Ocean > North Pacific Ocean (0.04)
North America > Canada (0.04)
(3 more...)

Genre: Research Report > New Finding (0.48)

Industry:

Energy (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Seong, Jihyeon, Oh, Sekwang, Choi, Jaesik

Towards Dynamic Trend Filtering through Trend Point Detection with Reinforcement Learning

arXiv.org Artificial Intelligence, Jun-5-2024

Trend filtering simplifies complex time series data by applying smoothness to filter out noise while emphasizing proximity to the original data. However, existing trend filtering methods fail to reflect abrupt changes in the trend due to 'approximateness,' resulting in constant smoothness. This approximateness uniformly filters out the tail distribution of time series data, characterized by extreme values, including both abrupt changes and noise. In this paper, we propose Trend Point Detection formulated as a Markov Decision Process (MDP), a novel approach to identifying essential points that should be reflected in the trend, departing from approximations. We term these essential points as Dynamic Trend Points (DTPs) and extract trends by interpolating them. To identify DTPs, we utilize Reinforcement Learning (RL) within a discrete action space and a forecasting sum-of-squares loss function as a reward, referred to as the Dynamic Trend Filtering network (DTF-net). DTF-net integrates flexible noise filtering, preserving critical original subsequences while removing noise as required for other subsequences. We demonstrate that DTF-net excels at capturing abrupt changes compared to other trend filtering algorithms and enhances forecasting performance, as abrupt changes are predicted rather than smoothed out.
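The step of extracting a trend by interpolating selected points can be sketched as follows. This covers only the interpolation: the reinforcement-learning agent that chooses the Dynamic Trend Points is not reproduced, the indices below are hand-picked for illustration, and linear interpolation and the function name are assumptions rather than details confirmed by the abstract.

```python
def interpolate_trend(series, trend_points):
    """Build a piecewise-linear trend through the selected indices.

    Assumption: the first and last indices are always included so the
    trend is defined over the whole series.
    """
    if len(series) < 2:
        return list(series)
    points = sorted(set(trend_points) | {0, len(series) - 1})
    trend = [0.0] * len(series)
    for left, right in zip(points, points[1:]):
        span = right - left
        for t in range(left, right + 1):
            frac = (t - left) / span
            trend[t] = series[left] + frac * (series[right] - series[left])
    return trend

# Hypothetical series with an abrupt level shift between index 3 and 4.
series = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]
trend_points = [3, 4]  # hand-picked stand-ins for detected DTPs
trend = interpolate_trend(series, trend_points)
```

Because indices 3 and 4 are both kept, the trend passes through the original values on either side of the jump, so the abrupt change is preserved while the small fluctuations inside each flat segment are smoothed away.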

abrupt change, dataset, dtf-net, (13 more...)

2406.03665

Country:

Asia > South Korea (0.04)
Pacific Ocean > North Pacific Ocean > San Francisco Bay (0.04)
North America > United States > California > San Francisco County > San Francisco (0.04)

Genre: Research Report (0.70)

Industry:

Banking & Finance (0.93)
Leisure & Entertainment (0.92)
Government (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)