AITopics

2207.01567

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Asia > Japan (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > Singapore (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Artificial IntelligenceOct-5-2022

Efficient Policy Iteration for Robust Markov Decision Processes via Regularization

Kumar, Navdeep, Levy, Kfir, Wang, Kaixin, Mannor, Shie

Robust Markov decision processes (MDPs) provide a general framework to model decision problems where the system dynamics are changing or only partially known. Efficient methods for some \texttt{sa}-rectangular robust MDPs exist, using its equivalence with reward regularized MDPs, generalizable to online settings. In comparison to \texttt{sa}-rectangular robust MDPs, \texttt{s}-rectangular robust MDPs are less restrictive but much more difficult to deal with. Interestingly, recent works have established the equivalence between \texttt{s}-rectangular robust MDPs and policy regularized MDPs. But we don't have a clear understanding to exploit this equivalence, to do policy improvement steps to get the optimal value function or policy. We don't have a clear understanding of greedy/optimal policy except it can be stochastic. There exist no methods that can naturally be generalized to model-free settings. We show a clear and explicit equivalence between \texttt{s}-rectangular $L_p$ robust MDPs and policy regularized MDPs that resemble very much policy entropy regularized MDPs widely used in practice. Further, we dig into the policy improvement step and concretely derive optimal robust Bellman operators for \texttt{s}-rectangular $L_p$ robust MDPs. We find that the greedy/optimal policies in \texttt{s}-rectangular $L_p$ robust MDPs are threshold policies that play top $k$ actions whose $Q$ value is greater than some threshold (value), proportional to the $(p-1)$th power of its advantage. In addition, we show time complexity of (\texttt{sa} and \texttt{s}-rectangular) $L_p$ robust MDPs is the same as non-robust MDPs up to some log factors. Our work greatly extends the existing understanding of \texttt{s}-rectangular robust MDPs and naturally generalizable to online settings.

artificial intelligence, machine learning, robust mdp, (15 more...)

2205.14327

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Singapore (0.04)
Asia > China (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.61)

Van Werde, Alexander, Senen-Cerda, Albert, Kosmella, Gianluca, Sanders, Jaron

Detection and Evaluation of Clusters within Sequential Data

arXiv.org Artificial IntelligenceOct-4-2022

Motivated by theoretical advancements in dimensionality reduction techniques we use a recent model, called Block Markov Chains, to conduct a practical study of clustering in real-world sequential data. Clustering algorithms for Block Markov Chains possess theoretical optimality guarantees and can be deployed in sparse data regimes. Despite these favorable theoretical properties, a thorough evaluation of these algorithms in realistic settings has been lacking. We address this issue and investigate the suitability of these clustering algorithms in exploratory data analysis of real-world sequential data. In particular, our sequential data is derived from human DNA, written text, animal movement data and financial markets. In order to evaluate the determined clusters, and the associated Block Markov Chain model, we further develop a set of evaluation tools. These tools include benchmarking, spectral noise analysis and statistical model selection tools. An efficient implementation of the clustering algorithm and the new evaluation tools is made available together with this paper. Practical challenges associated to real-world data are encountered and discussed. It is ultimately found that the Block Markov Chain model assumption, together with the tools developed here, can indeed produce meaningful insights in exploratory data analyses despite the complexity and sparsity of real-world data.

artificial intelligence, data mining, machine learning, (19 more...)

2210.01679

Genre: Research Report > New Finding (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Belomestny, Denis, Schoenmakers, John

Primal-dual regression approach for Markov decision processes with general state and action space

arXiv.org Machine LearningOct-4-2022

We develop a regression based primal-dual martingale approach for solving finite time horizon MDPs with general state and action space. As a result, our method allows for the construction of tight upper and lower biased approximations of the value functions, and, provides tight approximations to the optimal policy. In particular, we prove tight error bounds for the estimated duality gap featuring polynomial dependence on the time horizon, and sublinear dependence on the cardinality/dimension of the possibly infinite state and action space. From a computational point of view the proposed method is efficient since, in contrast to usual duality-based methods for optimal control problems in the literature, the Monte Carlo procedures here involved do not require nested simulations.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Machine Learning

2210.00258

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Berlin (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)

Ho, Qi Heng, Sunberg, Zachary N., Lahijanian, Morteza

Gaussian Belief Trees for Chance Constrained Asymptotically Optimal Motion Planning

arXiv.org Artificial IntelligenceOct-4-2022

In this paper, we address the problem of sampling-based motion planning under motion and measurement uncertainty with probabilistic guarantees. We generalize traditional sampling-based tree-based motion planning algorithms for deterministic systems and propose belief-$\mathcal{A}$, a framework that extends any kinodynamical tree-based planner to the belief space for linear (or linearizable) systems. We introduce appropriate sampling techniques and distance metrics for the belief space that preserve the probabilistic completeness and asymptotic optimality properties of the underlying planner. We demonstrate the efficacy of our approach for finding safe low-cost paths efficiently and asymptotically optimally in simulation, for both holonomic and non-holonomic systems.

artificial intelligence, belief space, machine learning, (16 more...)

doi: 10.1109/ICRA46639.2022.9812343

2202.12407

Country:

North America > United States > Colorado > Boulder County > Boulder (0.14)
Europe > Austria > Vienna (0.14)
North America > Canada > Quebec > Capitale-Nationale Region > Québec (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Kulkarni, Abhishek N., Fu, Jie

Opportunistic Qualitative Planning in Stochastic Systems with Incomplete Preferences over Reachability Objectives

arXiv.org Artificial IntelligenceOct-4-2022

Preferences play a key role in determining what goals/constraints to satisfy when not all constraints can be satisfied simultaneously. In this paper, we study how to synthesize preference satisfying plans in stochastic systems, modeled as an MDP, given a (possibly incomplete) combinative preference model over temporally extended goals. We start by introducing new semantics to interpret preferences over infinite plays of the stochastic system. Then, we introduce a new notion of improvement to enable comparison between two prefixes of an infinite play. Based on this, we define two solution concepts called safe and positively improving (SPI) and safe and almost-surely improving (SASI) that enforce improvements with a positive probability and with probability one, respectively. We construct a model called an improvement MDP, in which the synthesis of SPI and SASI strategies that guarantee at least one improvement reduces to computing positive and almost-sure winning strategies in an MDP. We present an algorithm to synthesize the SPI and SASI strategies that induce multiple sequential improvements. We demonstrate the proposed approach using a robot motion planning problem.

artificial intelligence, logic & formal reasoning, machine learning, (21 more...)

2210.01878

Country:

North America > United States > Florida > Alachua County > Gainesville (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.46)

Optimizing Data Collection for Machine Learning

Mahmood, Rafid, Lucas, James, Alvarez, Jose M., Fidler, Sanja, Law, Marc T.

Modern deep learning systems require huge data sets to achieve impressive performance, but there is little guidance on how much or what kind of data to collect. Over-collecting data incurs unnecessary present costs, while under-collecting may incur future costs and delay workflows. We propose a new paradigm for modeling the data collection workflow as a formal optimal data collection problem that allows designers to specify performance targets, collection costs, a time horizon, and penalties for failing to meet the targets. Additionally, this formulation generalizes to tasks requiring multiple data sources, such as labeled and unlabeled data used in semi-supervised learning. To solve our problem, we develop Learn-Optimize-Collect (LOC), which minimizes expected future collection costs. Finally, we numerically compare our framework to the conventional baseline of estimating data requirements by extrapolating from neural scaling laws. We significantly reduce the risks of failing to meet desired performance targets on several classification, segmentation, and detection tasks, while maintaining low total collection costs.

artificial intelligence, failure rate, machine learning, (14 more...)

2210.01234

Country: North America > Canada > Ontario > Toronto (0.14)

Genre:

Research Report > Experimental Study (0.67)
Research Report > New Finding (0.46)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Skau, Erik, Hollis, Andrew, Eidenbenz, Stephan, Rasmussen, Kim, Alexandrov, Boian

Process Modeling, Hidden Markov Models, and Non-negative Tensor Factorization with Model Selection

Monitoring of industrial processes is a critical capability in industry and in government to ensure reliability of production cycles, quick emergency response, and national security. Process monitoring allows users to gauge the involvement of an organization in an industrial process or predict the degradation or aging of machine parts in processes taking place at a remote location. Similar to many data science applications, we usually only have access to limited raw data, such as satellite imagery, short video clips, some event logs, and signatures captured by a small set of sensors. To combat data scarcity, we leverage the knowledge of subject matter experts (SMEs) who are familiar with the process. Various process mining techniques have been developed for this type of analysis; typically such approaches combine theoretical process models built based on domain expert insights with ad-hoc integration of available pieces of raw data. Here, we introduce a novel mathematically sound method that integrates theoretical process models (as proposed by SMEs) with interrelated minimal Hidden Markov Models (HMM), built via non-negative tensor factorization and discrete model simulations. Our method consolidates: (a) Theoretical process models development, (b) Discrete model simulations (c) HMM, (d) Joint Non-negative Matrix Factorization (NMF) and Non-negative Tensor Factorization (NTF), and (e) Custom model selection. To demonstrate our methodology and its abilities, we apply it on simple synthetic and real world process models.

artificial intelligence, machine learning, process model, (14 more...)

2210.0106

Country:

North America > United States > New Mexico > Los Alamos County > Los Alamos (0.05)
North America > United States > North Carolina (0.04)
North America > United States > New York (0.04)
Europe > Netherlands (0.04)

Genre: Research Report (0.64)

Industry:

Government > Regional Government (0.46)
Government > Military (0.34)
Materials > Metals & Mining (0.34)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Movement Analytics: Current Status, Application to Manufacturing, and Future Prospects from an AI Perspective

Baumgartner, Peter, Smith, Daniel, Rana, Mashud, Kapoor, Reena, Tartaglia, Elena, Schutt, Andreas, Rahman, Ashfaqur, Taylor, John, Dunstall, Simon

Data-driven decision making is becoming an integral part of manufacturing companies. Data is collected and commonly used to improve efficiency and produce high quality items for the customers. IoT-based and other forms of object tracking are an emerging tool for collecting movement data of objects/entities (e.g. human workers, moving vehicles, trolleys etc.) over space and time. Movement data can provide valuable insights like process bottlenecks, resource utilization, effective working time etc. that can be used for decision making and improving efficiency. Turning movement data into valuable information for industrial management and decision making requires analysis methods. We refer to this process as movement analytics. The purpose of this document is to review the current state of work for movement analytics both in manufacturing and more broadly. We survey relevant work from both a theoretical perspective and an application perspective. From the theoretical perspective, we put an emphasis on useful methods from two research areas: machine learning, and logic-based knowledge representation. We also review their combinations in view of movement analytics, and we discuss promising areas for future development and application. Furthermore, we touch on constraint optimization. From an application perspective, we review applications of these methods to movement analytics in a general sense and across various industries. We also describe currently available commercial off-the-shelf products for tracking in manufacturing, and we overview main concepts of digital twins and their applications.

logic & formal reasoning, machine learning, real time system, (29 more...)

2210.01344

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(31 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
Transportation > Ground > Rail (1.00)
(8 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
(18 more...)

Sample Complexity of Nonparametric Off-Policy Evaluation on Low-Dimensional Manifolds using Deep Networks

Ji, Xiang, Chen, Minshuo, Wang, Mengdi, Zhao, Tuo

We consider the off-policy evaluation problem of reinforcement learning using deep convolutional neural networks. We analyze the deep fitted Q-evaluation method for estimating the expected cumulative reward of a target policy, when the data are generated from an unknown behavior policy. We show that, by choosing network size appropriately, one can leverage any low-dimensional manifold structure in the Markov decision process and obtain a sample-efficient estimator without suffering from the curse of high data ambient dimensionality. Specifically, we establish a sharp error bound for fitted Q-evaluation, which depends on the intrinsic dimension of the state-action space, the smoothness of Bellman operator, and a function class-restricted $\chi^2$-divergence. It is noteworthy that the restricted $\chi^2$-divergence measures the behavior and target policies' {\it mismatch in the function space}, which can be small even if the two policies are not close to each other in their tabular forms. We also develop a novel approximation result for convolutional neural networks in Q-function estimation. Numerical experiments are provided to support our theoretical analysis.

approximation, machine learning, reinforcement learning, (18 more...)

2206.02887

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Switzerland > Basel-City > Basel (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)