
Collaborating Authors: Srivastava, Vaibhav


Trust-Aware Assistance Seeking in Human-Supervised Autonomy

arXiv.org Artificial Intelligence

Our goal is to model and experimentally assess trust evolution in order to predict future beliefs and behaviors of human-robot teams in dynamic environments. Research suggests that maintaining trust among team members is vital for successful team performance, and that trust is a multi-dimensional, latent entity that relates to past experiences and future actions in a complex manner. Employing a human-robot collaborative task, we design an optimal assistance-seeking strategy for the robot using a POMDP framework. In the task, a human supervises an autonomous mobile manipulator collecting objects in an environment; the supervisor's job is to ensure that the robot executes its task safely. The robot can either attempt to collect an object or seek human assistance. The human supervisor actively monitors the robot's activities, offering assistance upon request and intervening if they perceive that the robot may fail. In this setting, human trust is the hidden state, and the primary objective is to optimize team performance. We conduct two sets of human-robot interaction experiments. Data from the first experiment are used to estimate the POMDP parameters, which in turn are used to compute an optimal assistance-seeking policy evaluated in the second experiment. The estimated POMDP parameters reveal that, for most participants, human intervention is more probable when trust is low, particularly in high-complexity tasks. Our estimates suggest that asking for assistance in high-complexity tasks can positively impact human trust. Our experimental results show that the proposed trust-aware policy outperforms an optimal trust-agnostic policy. Finally, by comparing model estimates of human trust, obtained using only behavioral data, with the collected self-reported trust values, we show that the model estimates closely track the self-reported responses.
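
To make the belief-tracking step concrete, here is a minimal Python sketch of a Bayes filter over a two-state latent trust variable, with robot actions (attempt vs. ask) and human observations (rely vs. intervene). All transition and observation matrices are hypothetical placeholders, not the parameters estimated in the paper.

```python
import numpy as np

# T[a][s, s']: trust transition given robot action a (illustrative values)
T = {
    "attempt": np.array([[0.8, 0.2],
                         [0.1, 0.9]]),
    "ask":     np.array([[0.6, 0.4],
                         [0.05, 0.95]]),
}
# O[a][s', o]: probability of observing o in {rely, intervene} in state s'
O = {
    "attempt": np.array([[0.5, 0.5],    # low trust: interventions likely
                         [0.9, 0.1]]),  # high trust: reliance likely
    "ask":     np.array([[0.7, 0.3],
                         [0.95, 0.05]]),
}
OBS = {"rely": 0, "intervene": 1}

def belief_update(b, action, obs):
    """One Bayes-filter step: predict through T, correct through O."""
    predicted = b @ T[action]                  # sum_s T(s'|s,a) b(s)
    joint = predicted * O[action][:, OBS[obs]]
    return joint / joint.sum()                 # normalize

b = np.array([0.5, 0.5])                       # uniform prior over trust
for action, obs in [("attempt", "intervene"), ("ask", "rely")]:
    b = belief_update(b, action, obs)
    print(f"after ({action}, {obs}): P(trust=high) = {b[1]:.3f}")
```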


Assistance-Seeking in Human-Supervised Autonomy: Role of Trust and Secondary Task Engagement (Extended Version)

arXiv.org Artificial Intelligence

Using a dual-task paradigm, we explore how robot actions, performance, and the introduction of a secondary task influence human trust and engagement. In our study, a human supervisor simultaneously engages in a target-tracking task while supervising a mobile manipulator performing an object collection task. The robot can either autonomously collect the object or ask for human assistance. The human supervisor also has the choice to rely upon or interrupt the robot. Using data from initial experiments, we model the dynamics of human trust and engagement using a linear dynamical system (LDS). Furthermore, we develop a human action model to define the probability of human reliance on the robot. Our model suggests that participants are more likely to interrupt the robot when their trust and engagement are low during high-complexity collection tasks. Using Model Predictive Control (MPC), we design an optimal assistance-seeking policy. Evaluation experiments demonstrate the superior performance of the MPC policy over the baseline policy for most participants.
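
As an illustration of the LDS-plus-MPC structure described above, the following Python sketch evolves a two-dimensional (trust, engagement) state linearly under the robot's action and selects actions by enumerating short action sequences over the horizon. The matrices, reward weights, and horizon are illustrative assumptions, not the fitted model from the experiments.

```python
import itertools
import numpy as np

A = np.array([[0.95, 0.02],
              [0.00, 0.90]])            # slow trust/engagement dynamics
B = {"collect": np.array([0.01, 0.03]),
     "ask":     np.array([0.04, -0.01])}
reward = {"collect": 1.0, "ask": 0.4}   # task reward per action

def mpc_action(x, horizon=3, trust_weight=0.5):
    """Pick the first action of the best sequence over the horizon."""
    best_val, best_seq = -np.inf, None
    for seq in itertools.product(B, repeat=horizon):
        xs, val = x.copy(), 0.0
        for a in seq:
            xs = A @ xs + B[a]          # linear dynamical system step
            val += reward[a] + trust_weight * xs[0]
        if val > best_val:
            best_val, best_seq = val, seq
    return best_seq[0]

x = np.array([0.6, 0.8])                # current (trust, engagement) estimate
print("MPC recommends:", mpc_action(x))
```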


On Multi-Fidelity Impedance Tuning for Human-Robot Cooperative Manipulation

arXiv.org Artificial Intelligence

We examine how a human-robot interaction (HRI) system may be designed when input-output data from previous experiments are available. In particular, we consider how to select an optimal impedance in the assistance design for a cooperative manipulation task with a new operator. Due to variability between individuals, the design parameters that best suit one operator of the robot may not be the best for another. However, by incorporating historical data using a linear auto-regressive (AR-1) Gaussian process, the search for a new operator's optimal parameters can be accelerated. We lay out a framework for optimizing human-robot cooperative manipulation that requires only input-output data. We establish how the AR-1 model improves the bound on the regret, and we numerically simulate a human-robot cooperative manipulation task to demonstrate the regret improvement. Further, through an additional numerical study, we show how the input-output nature of our approach provides robustness against modeling error.
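
The AR-1 structure can be made concrete with a small Gaussian-process sketch: the new operator's response is modeled as rho times the old operator's function plus an independent discrepancy term, so historical input-output data shape the posterior for the new operator. The kernels, rho, and all data below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def rbf(X1, X2, ell=0.5, var=1.0):
    d = X1[:, None] - X2[None, :]
    return var * np.exp(-0.5 * (d / ell) ** 2)

rho, noise = 0.8, 1e-3
X_old = np.linspace(0, 1, 8)           # impedance settings tried before
y_old = np.sin(3 * X_old)              # stand-in for old operator's scores
X_new = np.array([0.2, 0.7])
y_new = rho * np.sin(3 * X_new) + 0.1  # sparse data from the new operator

def k_delta(Xa, Xb):                   # discrepancy kernel (illustrative)
    return rbf(Xa, Xb, ell=0.3, var=0.2)

# Joint covariance of [y_old, y_new] under the AR-1 structure.
K = np.block([
    [rbf(X_old, X_old),       rho * rbf(X_old, X_new)],
    [rho * rbf(X_new, X_old), rho**2 * rbf(X_new, X_new) + k_delta(X_new, X_new)],
]) + noise * np.eye(len(X_old) + len(X_new))

y = np.concatenate([y_old, y_new])
Xs = np.linspace(0, 1, 5)              # candidate impedance parameters
k_star = np.vstack([rho * rbf(X_old, Xs),
                    rho**2 * rbf(X_new, Xs) + k_delta(X_new, Xs)])
mu = k_star.T @ np.linalg.solve(K, y)  # posterior mean for the new operator
print("posterior mean over candidates:", np.round(mu, 3))
```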


Deterministic Sequencing of Exploration and Exploitation for Reinforcement Learning

arXiv.org Artificial Intelligence

We propose the Deterministic Sequencing of Exploration and Exploitation (DSEE) algorithm, which interleaves exploration and exploitation epochs, for model-based RL problems that aim to simultaneously learn the system model, i.e., a Markov decision process (MDP), and the associated optimal policy. During exploration, DSEE explores the environment and updates the estimates of expected rewards and transition probabilities. During exploitation, the latest estimates of the expected rewards and transition probabilities are used to obtain a policy that is robust with high probability. We design the lengths of the exploration and exploitation epochs such that the cumulative regret grows as a sublinear function of time.
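
A minimal sketch of the deterministic scheduling idea: fixed-length exploration epochs interleaved with geometrically growing exploitation epochs, so the fraction of time spent exploring decays over the horizon. The epoch lengths and growth rate here are illustrative choices, not the constants from the paper's regret analysis.

```python
def dsee_schedule(horizon, explore_len=10, growth=2):
    """Return the (phase, length) sequence covering the horizon."""
    t, epoch, phases = 0, 0, []
    while t < horizon:
        phases.append(("explore", explore_len))   # update model estimates
        t += explore_len
        exploit_len = explore_len * growth**epoch  # geometrically growing
        phases.append(("exploit", exploit_len))    # run the robust policy
        t += exploit_len
        epoch += 1
    return phases

for phase, length in dsee_schedule(500):
    print(f"{phase:8s} for {length} steps")
```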


Regret Analysis of Distributed Gaussian Process Estimation and Coverage

arXiv.org Machine Learning

We study the problem of distributed multi-robot coverage over an unknown, nonuniform sensory field. Modeling the sensory field as a realization of a Gaussian Process and using Bayesian techniques, we devise a policy which aims to balance the tradeoff between learning the sensory function and covering the environment. We propose an adaptive coverage algorithm called Deterministic Sequencing of Learning and Coverage (DSLC) that schedules learning and coverage epochs such that its emphasis gradually shifts from exploration to exploitation while never fully ceasing to learn. Using a novel definition of coverage regret which characterizes overall coverage performance of a multi-robot team over a time horizon $T$, we analyze DSLC to provide an upper bound on expected cumulative coverage regret. Finally, we illustrate the empirical performance of the algorithm through simulations of the coverage task over an unknown distribution of wildfires.
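
The coverage side of such an epoch can be illustrated with a standard Lloyd-type step: each robot moves to the centroid of its Voronoi cell, weighted by the current estimate of the sensory field. The sketch below uses a fixed placeholder density rather than a learned Gaussian-process posterior, so it shows only the coverage update, not the learning epochs.

```python
import numpy as np

rng = np.random.default_rng(0)
pts = rng.uniform(0, 1, size=(400, 2))             # discretized environment
density = np.exp(-10 * ((pts - 0.7) ** 2).sum(1))  # stand-in field estimate
robots = rng.uniform(0, 1, size=(4, 2))

for _ in range(20):
    # Assign each point to its nearest robot (Voronoi partition).
    d = ((pts[:, None, :] - robots[None, :, :]) ** 2).sum(-1)
    owner = d.argmin(1)
    # Move each robot to the density-weighted centroid of its cell.
    for i in range(len(robots)):
        mask = owner == i
        if mask.any():
            w = density[mask]
            robots[i] = (pts[mask] * w[:, None]).sum(0) / w.sum()

print("final robot positions:\n", np.round(robots, 3))
```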


Minimax Policy for Heavy-tailed Multi-armed Bandits

arXiv.org Machine Learning

In these contexts, exploration means learning the environment while exploitation means taking empirically computed best actions. When finite-time performance is concerned, i.e., in scenarios in which one cannot learn indefinitely, ensuring a good balance of exploration and exploitation is the key to good performance. MAB and its variations are prototypical models for these problems, and they are widely used in many areas such as network routing, recommendation systems, and resource allocation; see [1, Chapter 1]. The stochastic MAB problem was originally proposed by Robbins [2]. In this problem, at each time, an agent chooses an arm from a set of K arms and receives the associated reward. In the worst-case setting, the lower and upper bounds are distribution-free. Assuming the rewards are bounded, Audibert and Bubeck [6] establish an $\Omega(\sqrt{KT})$ lower bound on the minimax regret. They also studied a modified UCB algorithm called Minimax Optimal Strategy in the Stochastic case (MOSS) and proved that it achieves an order-optimal worst-case regret while maintaining a logarithmic distribution-dependent regret. Degenne and Perchet [7] extend MOSS to an anytime version called MOSS-anytime. The rewards being bounded or sub-Gaussian is a common assumption that gives the sample mean an exponential convergence rate and simplifies the MAB problem.
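
For concreteness, here is a minimal simulation of the MOSS index discussed above: each arm's upper confidence bound uses log(T/(K n_i)) truncated at zero, which is what yields the distribution-free worst-case guarantee for bounded rewards. The Bernoulli bandit instance is illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
T, means = 5000, np.array([0.4, 0.5, 0.6])
K = len(means)
n = np.zeros(K)          # pull counts
s = np.zeros(K)          # reward sums

for t in range(T):
    if t < K:
        arm = t          # pull each arm once to initialize
    else:
        # MOSS index: sample mean plus truncated-logarithm bonus.
        bonus = np.sqrt(np.maximum(np.log(T / (K * n)), 0.0) / n)
        arm = int(np.argmax(s / n + bonus))
    s[arm] += rng.random() < means[arm]   # Bernoulli reward
    n[arm] += 1

print("pulls per arm:", n.astype(int))
print("regret:", means.max() * T - (n * means).sum())
```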


Expedited Multi-Target Search with Guaranteed Performance via Multi-fidelity Gaussian Processes

arXiv.org Machine Learning

We consider a scenario in which an autonomous vehicle equipped with a downward-facing camera operates in a 3D environment and is tasked with searching for an unknown number of stationary targets on the 2D floor of the environment. The key challenge is to minimize the search time while ensuring a high detection accuracy. We model the sensing field using a multi-fidelity Gaussian process that systematically describes the sensing information available at different altitudes from the floor. Based on the sensing model, we design a novel algorithm called Expedited Multi-Target Search (EMTS) that (i) addresses the coverage-accuracy trade-off: sampling at locations farther from the floor provides a wider field of view but less accurate measurements, (ii) computes an occupancy map of the floor within a prescribed accuracy and quickly eliminates unoccupied regions from the search space, and (iii) travels efficiently to collect the required samples for target detection. We rigorously analyze the algorithm and establish formal guarantees on the target detection accuracy and the expected detection time. We illustrate the algorithm using a simulated multi-target search scenario.
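
The coverage-accuracy trade-off can be sketched with a toy occupancy-map update: a high-altitude sample covers more cells per observation but with a noisier detector, while a low-altitude sample is narrow and accurate. The detection and false-alarm rates below are made-up numbers, not the paper's sensing model.

```python
import numpy as np

rng = np.random.default_rng(2)
grid = rng.random(30) < 0.1            # true target occupancy of 30 cells
belief = np.full(30, 0.1)              # prior P(target) per cell

# altitude -> (cells seen per sample, P(detect | target), P(detect | empty))
ALTITUDES = {"high": (10, 0.7, 0.2), "low": (2, 0.95, 0.02)}

def observe_and_update(cells, alt):
    """Bayes-update the occupancy belief for the cells in view."""
    fov, pd, pf = ALTITUDES[alt]
    for c in cells[:fov]:
        z = rng.random() < (pd if grid[c] else pf)   # noisy detection
        like1 = pd if z else 1 - pd                  # P(z | target)
        like0 = pf if z else 1 - pf                  # P(z | empty)
        belief[c] = like1 * belief[c] / (
            like1 * belief[c] + like0 * (1 - belief[c]))

observe_and_update(np.arange(30), "high")   # quick, coarse sweep
observe_and_update(np.arange(5), "low")     # refine a few cells up close
print("posterior for first 10 cells:", np.round(belief[:10], 2))
```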


On Distributed Multi-player Multiarmed Bandit Problems in Abruptly Changing Environment

arXiv.org Machine Learning

We study the multi-player stochastic multiarmed bandit (MAB) problem in an abruptly changing environment. We consider a collision model in which a player receives a reward at an arm only if it is the only player to select that arm. We design two novel algorithms, namely, the Round-Robin Sliding-Window Upper Confidence Bound# (RR-SW-UCB#) and the Sliding-Window Distributed Learning with Prioritization (SW-DLP) algorithms. We rigorously analyze these algorithms and show that their expected cumulative group regret is upper bounded by a sublinear function of time, i.e., the time average of the regret asymptotically converges to zero. We complement our analytic results with numerical illustrations.
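
The collision model is easy to state in code: a player earns an arm's reward only when no other player selects the same arm. A minimal sketch, with an illustrative Bernoulli instance:

```python
import numpy as np

rng = np.random.default_rng(3)
means = np.array([0.9, 0.6, 0.3])   # illustrative arm means

def group_rewards(choices):
    """choices[i] = arm picked by player i; colliding players get 0."""
    rewards = np.zeros(len(choices))
    for arm in set(choices):
        players = [i for i, c in enumerate(choices) if c == arm]
        if len(players) == 1:       # reward only without a collision
            rewards[players[0]] = rng.random() < means[arm]
    return rewards

print(group_rewards([0, 1]))   # orthogonal choices: both can earn reward
print(group_rewards([0, 0]))   # collision: both earn zero
```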


On Abruptly-Changing and Slowly-Varying Multiarmed Bandit Problems

arXiv.org Machine Learning

We study the non-stationary stochastic multiarmed bandit (MAB) problem and propose two generic algorithms, namely, the Limited Memory Deterministic Sequencing of Exploration and Exploitation (LM-DSEE) and the Sliding-Window Upper Confidence Bound# (SW-UCB#). We rigorously analyze these algorithms in abruptly-changing and slowly-varying environments and characterize their performance. We show that the expected cumulative regret for these algorithms under either of the environments is upper bounded by sublinear functions of time, i.e., the time average of the regret asymptotically converges to zero. We complement our analytic results with numerical illustrations.
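
To illustrate the sliding-window idea, the sketch below computes a UCB index over a recent window whose width grows sublinearly with t, so stale rewards from before an abrupt change are forgotten. The window rule and exploration constant are illustrative assumptions, not the exact SW-UCB# specification.

```python
import numpy as np

rng = np.random.default_rng(4)
T, K = 3000, 2
history = []                           # (arm, reward) pairs

def sw_ucb_index(t, alpha=0.8, c=1.0):
    w = max(1, int(t ** alpha))        # growing, sublinear window
    recent = history[-w:]
    idx = np.full(K, np.inf)           # arms unplayed in window go first
    for a in range(K):
        r = [x for arm, x in recent if arm == a]
        if r:
            idx[a] = np.mean(r) + c * np.sqrt(np.log(t) / len(r))
    return idx

for t in range(1, T + 1):
    means = [0.8, 0.2] if t < T // 2 else [0.2, 0.8]  # abrupt change
    arm = int(np.argmax(sw_ucb_index(t)))
    history.append((arm, float(rng.random() < means[arm])))

# After the change, the window forgets arm 0's stale high rewards.
print("picks of arm 1 in last 100 steps:", sum(a for a, _ in history[-100:]))
```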


On Distributed Cooperative Decision-Making in Multiarmed Bandits

arXiv.org Machine Learning

We study the explore-exploit tradeoff in distributed cooperative decision-making in the context of the multiarmed bandit (MAB) problem. For the distributed cooperative MAB problem, we design the cooperative UCB algorithm, which comprises two interleaved distributed processes: (i) running consensus algorithms for estimation of rewards, and (ii) upper-confidence-bound-based heuristics for selection of arms. We rigorously analyze the performance of the cooperative UCB algorithm and characterize the influence of the communication graph structure on the decision-making performance of the group.
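
The two interleaved processes can be sketched directly: each agent selects an arm from a UCB index built on its local statistics, then averages those statistics with its neighbors over the communication graph. The graph weights and bandit instance below are illustrative, not the paper's exact update rules.

```python
import numpy as np

rng = np.random.default_rng(5)
N, K, T = 3, 2, 2000
means = np.array([0.3, 0.7])
W = np.array([[0.5, 0.5, 0.0],         # doubly stochastic weights for
              [0.5, 0.0, 0.5],         # a path graph 0-1-2
              [0.0, 0.5, 0.5]])
s = np.zeros((N, K))                   # local estimates of reward sums
n = np.zeros((N, K))                   # local estimates of pull counts

for t in range(1, T + 1):
    # (ii) UCB selection per agent from its local (consensus) statistics.
    ucb = s / np.maximum(n, 1) + np.sqrt(2 * np.log(t) / np.maximum(n, 1))
    arms = ucb.argmax(1)
    for i, a in enumerate(arms):
        s[i, a] += rng.random() < means[a]
        n[i, a] += 1
    # (i) consensus averaging of statistics with graph neighbors.
    s, n = W @ s, W @ n

print("per-agent estimated means:\n", np.round(s / np.maximum(n, 1), 2))
```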