AITopics

2405.16545

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > Taiwan (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Seo, SeungWon, Lee, Junhyeok, Noh, SeongRae, Kang, HyeongYeop

LLM-Based Cooperative Agents using Information Relevance and Plan Validation

arXiv.org Artificial IntelligenceMay-26-2024

We address the challenge of multi-agent cooperation, where agents achieve a common goal by interacting with a 3D scene and cooperating with decentralized agents under complex partial observations. This involves managing communication costs and optimizing interaction trajectories in dynamic environments. Our research focuses on three primary limitations of existing cooperative agent systems. Firstly, current systems demonstrate inefficiency in managing acquired information through observation, resulting in declining planning performance as the environment becomes more complex with additional objects or goals. Secondly, the neglect of false plans in partially observable settings leads to suboptimal cooperative performance, as agents struggle to adapt to environmental changes influenced by the unseen actions of other agents. Lastly, the failure to incorporate spatial data into decision-making processes restricts the agent's ability to construct optimized trajectories. To overcome these limitations, we propose the RElevance and Validation-Enhanced Cooperative Language Agent (REVECA), a novel cognitive architecture powered by GPT-3.5. REVECA leverages relevance assessment, plan validation, and spatial information to enhance the efficiency and robustness of agent cooperation in dynamic and partially observable environments while minimizing continuous communication costs and effectively managing irrelevant dummy objects. Our extensive experiments demonstrate the superiority of REVECA over previous approaches, including those driven by GPT-4.0. Additionally, a user study highlights REVECA's potential for achieving trustworthy human-AI cooperation. We expect that REVECA will have significant applications in gaming, XR applications, educational tools, and humanoid robots, contributing to substantial economic, commercial, and academic advancements.

agent, information, reveca, (13 more...)

2405.16751

Country: Asia > South Korea (0.04)

Genre:

Research Report (1.00)
Questionnaire & Opinion Survey (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Ghosh, Soumyadip, Lu, Yingdong, Nowicki, Tomasz

On Convergence of the Alternating Directions SGHMC Algorithm

arXiv.org Artificial IntelligenceMay-26-2024

We study convergence rates of Hamiltonian Monte Carlo (HMC) algorithms with leapfrog integration under mild conditions on stochastic gradient oracle for the target distribution (SGHMC). Our method extends standard HMC by allowing the use of general auxiliary distributions, which is achieved by a novel procedure of Alternating Directions. The convergence analysis is based on the investigations of the Dirichlet forms associated with the underlying Markov chain driving the algorithms. For this purpose, we provide a detailed analysis on the error of the leapfrog integrator for Hamiltonian motions with both the kinetic and potential energy functions in general form. We characterize the explicit dependence of the convergence rates on key parameters such as the problem dimension, functional properties of both the target and auxiliary distributions, and the quality of the oracle.

algorithm, auxiliary distribution, markov chain, (14 more...)

2405.1314

Country:

North America > United States > New York (0.04)
Europe > Romania > Sud-Est Development Region > Constanța County > Constanța (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.37)

Limmer, Yannick, Kratsios, Anastasis, Yang, Xuwei, Saqur, Raeid, Horvath, Blanka

Reality Only Happens Once: Single-Path Generalization Bounds for Transformers

arXiv.org Machine LearningMay-26-2024

One of the inherent challenges in deploying transformers on time series is that \emph{reality only happens once}; namely, one typically only has access to a single trajectory of the data-generating process comprised of non-i.i.d. observations. We derive non-asymptotic statistical guarantees in this setting through bounds on the \textit{generalization} of a transformer network at a future-time $t$, given that it has been trained using $N\le t$ observations from a single perturbed trajectory of a Markov process. Under the assumption that the Markov process satisfies a log-Sobolev inequality, we obtain a generalization bound which effectively converges at the rate of ${O}(1/\sqrt{N})$. Our bound depends explicitly on the activation function ($\operatorname{Swish}$, $\operatorname{GeLU}$, or $\tanh$ are considered), the number of self-attention heads, depth, width, and norm-bounds defining the transformer architecture. Our bound consists of three components: (I) The first quantifies the gap between the stationary distribution of the data-generating Markov process and its distribution at time $t$, this term converges exponentially to $0$. (II) The next term encodes the complexity of the transformer model and, given enough time, eventually converges to $0$ at the rate ${O}(\log(N)^r/\sqrt{N})$ for any $r>0$. (III) The third term guarantees that the bound holds with probability at least $1$-$\delta$, and converges at a rate of ${O}(\sqrt{\log(1/\delta)}/\sqrt{N})$.

def, generalization, transformer, (15 more...)

arXiv.org Machine Learning

2405.16563

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York (0.04)
North America > Canada > Ontario > Hamilton (0.04)
(8 more...)

Genre: Research Report (0.50)

Industry: Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.75)

Tkachuk, Volodymyr, Weisz, Gellért, Szepesvári, Csaba

Trajectory Data Suffices for Statistically Efficient Learning in Offline RL with Linear $q^\pi$-Realizability and Concentrability

arXiv.org Machine LearningMay-26-2024

We consider offline reinforcement learning (RL) in $H$-horizon Markov decision processes (MDPs) under the linear $q^\pi$-realizability assumption, where the action-value function of every policy is linear with respect to a given $d$-dimensional feature function. The hope in this setting is that learning a good policy will be possible without requiring a sample size that scales with the number of states in the MDP. Foster et al. [2021] have shown this to be impossible even under $\textit{concentrability}$, a data coverage assumption where a coefficient $C_\text{conc}$ bounds the extent to which the state-action distribution of any policy can veer off the data distribution. However, the data in this previous work was in the form of a sequence of individual transitions. This leaves open the question of whether the negative result mentioned could be overcome if the data was composed of sequences of full trajectories. In this work we answer this question positively by proving that with trajectory data, a dataset of size $\text{poly}(d,H,C_\text{conc})/\epsilon^2$ is sufficient for deriving an $\epsilon$-optimal policy, regardless of the size of the state space. The main tool that makes this result possible is due to Weisz et al. [2023], who demonstrate that linear MDPs can be used to approximate linearly $q^\pi$-realizable MDPs. The connection to trajectory data is that the linear MDP approximation relies on "skipping" over certain states. The associated estimation problems are thus easy when working with trajectory data, while they remain nontrivial when working with individual transitions. The question of computational efficiency under our assumptions remains open.

optimization problem 1, probability, traj, (14 more...)

arXiv.org Machine Learning

2405.16809

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)

Weinzierl, Sven, Zilker, Sandra, Dunzer, Sebastian, Matzner, Martin

Machine learning in business process management: A systematic literature review

Machine learning (ML) provides algorithms to create computer programs based on data without explicitly programming them. In business process management (BPM), ML applications are used to analyse and improve processes efficiently. Three frequent examples of using ML are providing decision support through predictions, discovering accurate process models, and improving resource allocation. This paper organises the body of knowledge on ML in BPM. We extract BPM tasks from different literature streams, summarise them under the phases of a process`s lifecycle, explain how ML helps perform these tasks and identify technical commonalities in ML implementations across tasks. This study is the first exhaustive review of how ML has been used in BPM. We hope that it can open the door for a new era of cumulative research by helping researchers to identify relevant preliminary work and then combine and further develop existing approaches in a focused fashion. Our paper helps managers and consultants to find ML applications that are relevant in the current project phase of a BPM initiative, like redesigning a business process. We also offer - as a synthesis of our review - a research agenda that spreads ten avenues for future research, including applying novel ML concepts like federated learning, addressing less regarded BPM lifecycle phases like process identification, and delivering ML applications with a focus on end-users.

information system, literature review, process mining, (12 more...)

2405.16396

Country:

Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.14)
South America > Argentina > Patagonia > Río Negro Province > Viedma (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
(3 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.45)

Industry:

Information Technology (1.00)
Banking & Finance (0.92)
Education (0.67)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
(14 more...)

Variational Offline Multi-agent Skill Discovery

Chen, Jiayu, Ganguly, Bhargav, Lan, Tian, Aggarwal, Vaneet

Skills are effective temporal abstractions established for sequential decision making tasks, which enable efficient hierarchical learning for long-horizon tasks and facilitate multi-task learning through their transferability. Despite extensive research, research gaps remain in multi-agent scenarios, particularly for automatically extracting subgroup coordination patterns in a multi-agent task. In this case, we propose two novel auto-encoder schemes: VO-MASD-3D and VO-MASD-Hier, to simultaneously capture subgroup- and temporal-level abstractions and form multi-agent skills, which firstly solves the aforementioned challenge. An essential algorithm component of these schemes is a dynamic grouping function that can automatically detect latent subgroups based on agent interactions in a task. Notably, our method can be applied to offline multi-task data, and the discovered subgroup skills can be transferred across relevant tasks without retraining. Empirical evaluations on StarCraft tasks indicate that our approach significantly outperforms existing methods regarding applying skills in multi-agent reinforcement learning (MARL). Moreover, skills discovered using our method can effectively reduce the learning difficulty in MARL scenarios with delayed and sparse reward signals.

agent, algorithm, multi-agent skill, (14 more...)

2405.16386

Country:

North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
North America > United States > District of Columbia > Washington (0.04)

Genre:

Research Report (1.00)
Overview (0.93)

Industry:

Education (0.46)
Leisure & Entertainment (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

AutoManual: Generating Instruction Manuals by LLM Agents via Interactive Environmental Learning

Chen, Minghao, Li, Yihang, Yang, Yanting, Yu, Shiyu, Lin, Binbin, He, Xiaofei

Large Language Models (LLM) based agents have shown promise in autonomously completing tasks across various domains, e.g., robotics, games, and web navigation. However, these agents typically require elaborate design and expert prompts to solve tasks in specific domains, which limits their adaptability. We introduce AutoManual, a framework enabling LLM agents to autonomously build their understanding through interaction and adapt to new environments. AutoManual categorizes environmental knowledge into diverse rules and optimizes them in an online fashion by two agents: 1) The Planner codes actionable plans based on current rules for interacting with the environment. 2) The Builder updates the rules through a well-structured rule system that facilitates online rule management and essential detail retention. To mitigate hallucinations in managing rules, we introduce \textit{case-conditioned prompting} strategy for the Builder. Finally, the Formulator agent compiles these rules into a comprehensive manual. The self-generated manual can not only improve the adaptability but also guide the planning of smaller LLMs while being human-readable. Given only one simple demonstration, AutoManual significantly improves task success rates, achieving 97.4\% with GPT-4-turbo and 86.2\% with GPT-3.5-turbo on ALFWorld benchmark tasks. The source code will be available soon.

agent, mug, receptacle, (14 more...)

2405.16247

Country:

Asia > China > Zhejiang Province > Ningbo (0.14)
Asia > Japan > Shikoku > Kagawa Prefecture > Takamatsu (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre:

Workflow (0.49)
Research Report (0.40)

Industry: Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.45)

Tensor Attention Training: Provably Efficient Learning of Higher-order Transformers

Gu, Jiuxiang, Liang, Yingyu, Shi, Zhenmei, Song, Zhao, Zhou, Yufa

Tensor Attention, a multi-view attention that is able to capture high-order correlations among multiple modalities, can overcome the representational limitations of classical matrix attention. However, the $\Omega(n^3)$ time complexity of tensor attention poses a significant obstacle to its practical implementation in transformers, where $n$ is the input sequence length. In this work, we prove that the backward gradient of tensor attention training can be computed in almost linear $n^{1+o(1)}$ time, the same complexity as its forward computation under a bounded entries assumption. We provide a closed-form solution for the gradient and propose a fast computation method utilizing polynomial approximation methods and tensor algebraic tricks. Furthermore, we prove the necessity and tightness of our assumption through hardness analysis, showing that slightly weakening it renders the gradient problem unsolvable in truly subcubic time. Our theoretical results establish the feasibility of efficient higher-order transformer training and may facilitate practical applications of tensor attention architectures.

computation, definition 3, step follow, (13 more...)

2405.16411

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)
(9 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

Bramblett, Daniel, Srivastava, Siddharth

Belief-State Query Policies for Planning With Preferences Under Partial Observability

arXiv.org Artificial IntelligenceMay-24-2024

Planning in real-world settings often entails addressing partial observability while aligning with users' preferences. We present a novel framework for expressing users' preferences about agent behavior in a partially observable setting using parameterized belief-state query (BSQ) preferences in the setting of goal-oriented partially observable Markov decision processes (gPOMDPs). We present the first formal analysis of such preferences and prove that while the expected value of a BSQ preference is not a convex function w.r.t its parameters, it is piecewise constant and yields an implicit discrete parameter search space that is finite for finite horizons. This theoretical result leads to novel algorithms that optimize gPOMDP agent behavior while guaranteeing user preference compliance. Theoretical analysis proves that our algorithms converge to the optimal preference-compliant behavior in the limit. Empirical results show that BSQ preferences provide a computationally feasible approach for planning with preferences in partially observable settings.

artificial intelligence, belief revision, machine learning, (18 more...)

2405.15907

Country:

North America > United States > Virginia > Arlington County > Arlington (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Arizona (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Belief Revision (0.87)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.86)