AITopics

2212.10748

Country:

North America > Canada > Ontario > National Capital Region > Ottawa (0.15)
North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.14)
Europe > United Kingdom (0.04)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

#artificialintelligenceDec-19-2022, 06:21:07 GMT

GitHub - haifengl/smile: Statistical Machine Intelligence & Learning Engine

Smile (Statistical Machine Intelligence and Learning Engine) is a fast and comprehensive machine learning, NLP, linear algebra, graph, interpolation, and visualization system in Java and Scala. With advanced data structures and algorithms, Smile delivers state-of-art performance. Smile is well documented and please check out the project website for programming guides and more information. Smile covers every aspect of machine learning, including classification, regression, clustering, association rule mining, feature selection, manifold learning, multidimensional scaling, genetic algorithms, missing value imputation, efficient nearest neighbor search, etc. Feature Selection: Genetic Algorithm based Feature Selection, Ensemble Learning based Feature Selection, TreeSHAP, Signal Noise ratio, Sum Squares ratio. You can use the libraries through Maven central repository by adding the following to your project pom.xml file.

algorithm, library, statistical machine intelligence, (10 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.32)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.31)

Gupta, Piyush, Srivastava, Vaibhav

Deterministic Sequencing of Exploration and Exploitation for Reinforcement Learning

We propose Deterministic Sequencing of Exploration and Exploitation (DSEE) algorithm with interleaving exploration and exploitation epochs for model-based RL problems that aim to simultaneously learn the system model, i.e., a Markov decision process (MDP), and the associated optimal policy. During exploration, DSEE explores the environment and updates the estimates for expected reward and transition probabilities. During exploitation, the latest estimates of the expected reward and transition probabilities are used to obtain a robust policy with high probability. We design the lengths of the exploration and exploitation epochs such that the cumulative regret grows as a sub-linear function of time.

data mining, machine learning, reinforcement learning, (17 more...)

2209.05408

Country: North America > United States > Michigan (0.28)

Genre: Research Report (0.40)

Industry: Energy > Oil & Gas > Upstream (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Jerbi, Sofiene, Cornelissen, Arjan, Ozols, Māris, Dunjko, Vedran

Quantum policy gradient algorithms

When studying the potential advantages of quantum computing in machine learning, a natural question that arises is whether quantum algorithms that exploit quantum access to data can speed up learning. In the context of supervised learning, this led to the development of algorithms based on quantum RAMs, which can achieve high-degree polynomial improvements over their classical analogs [1]. In reinforcement learning, where we consider learning agents interacting with task environments, the question becomes: can quantum interactions with an environment, and in particular the ability to explore several trajectories in superposition, be beneficial for a learning agent. In recent years, several works have approached this question from a variety of angles [2]: based on Grover's algorithm [3], some works have for instance shown that searching for an optimal sequence of actions in an environment can be done using quadratically fewer interactions given the appropriate oracular access to the environment [4, 5, 6]. Other works have considered the more general problem of finding the optimal policy in a Markov Decision Process (MDP), and have found that up to quadratic speed-ups in the number of interactions are also possible, again given the proper oracular access [7, 8, 9, 10]. Finally, tailored MDP environments (based, e.g., on Simon's problem) have also been introduced, which allow for exponential quantum speed-ups in learning times compared to the best classical agents [11]. Yet, all the quantum algorithms that have been proposed in this quantum-accessible setting remain inefficient in the most well-publicised use cases of reinforcement learning, such as Go [12], city navigation [13], and computer games [14]: environments with large state-action spaces. Indeed, aside from the task-specific algorithms of Ref. [11], the proposed algorithms scale at best as the square root of the size of the state-action space, which is intractable in most modern-day applications that deal for instance with image-based inputs. In the classical literature, modern approaches to reinforcement learning in large spaces commonly replace the explicit storage of a policy (and/or a value function) in a table of values by a parametrized model (e.g., a

artificial intelligence, machine learning, reinforcement learning, (14 more...)

doi: 10.4230/LIPIcs.TQC.2023.13

2212.09328

Country:

Asia > Middle East > Jordan (0.04)
Europe > Netherlands > South Holland > Leiden (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
Europe > Austria > Tyrol > Innsbruck (0.04)

Genre: Research Report (0.64)

Industry: Leisure & Entertainment > Games (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Tabakhi, Sina, Suvon, Mohammod Naimul Islam, Ahadian, Pegah, Lu, Haiping

Multimodal Learning for Multi-Omics: A Survey

With advanced imaging, sequencing, and profiling technologies, multiple omics data become increasingly available and hold promises for many healthcare applications such as cancer diagnosis and treatment. Multimodal learning for integrative multi-omics analysis can help researchers and practitioners gain deep insights into human diseases and improve clinical decisions. However, several challenges are hindering the development in this area, including the availability of easily accessible open-source tools. This survey aims to provide an up-to-date overview of the data challenges, fusion approaches, datasets, and software tools from several new perspectives. We identify and investigate various omics data challenges that can help us understand the field better. We categorize fusion approaches comprehensively to cover existing methods in this area. We collect existing open-source tools to facilitate their broader utilization and development. We explore a broad range of omics data modalities and a list of accessible datasets. Finally, we summarize future directions that can potentially address existing gaps and answer the pressing need to advance multimodal learning for multi-omics data analysis.

bioinformatics, machine learning, programming language, (20 more...)

doi: 10.1142/S2811032322500047

2211.16509

Country:

Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(9 more...)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.92)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
(4 more...)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
(10 more...)

Zharova, Alona, Scherz, Antonia

Multistep Multiappliance Load Prediction

A well-performing prediction model is vital for a recommendation system suggesting actions for energy-efficient consumer behavior. However, reliable and accurate predictions depend on informative features and a suitable model design to perform well and robustly across different households and appliances. Moreover, customers' unjustifiably high expectations of accurate predictions may discourage them from using the system in the long term. In this paper, we design a three-step forecasting framework to assess predictability, engineering features, and deep learning architectures to forecast 24 hourly load values. First, our predictability analysis provides a tool for expectation management to cushion customers' anticipations. Second, we design several new weather-, time- and appliance-related parameters for the modeling procedure and test their contribution to the model's prediction performance. Third, we examine six deep learning techniques and compare them to tree- and support vector regression benchmarks. We develop a robust and accurate model for the appliance-level load prediction based on four datasets from four different regions (US, UK, Austria, and Canada) with an equal set of appliances. The empirical results show that cyclical encoding of time features and weather indicators alongside a long-short term memory (LSTM) model offer the optimal performance.

artificial intelligence, deep learning, machine learning, (13 more...)

2212.09426

Country:

North America > Canada (0.24)
Europe > Austria (0.24)
Europe > Germany > Berlin (0.04)
(7 more...)

Genre: Research Report > New Finding (1.00)

Industry: Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)

arXiv.org Artificial IntelligenceDec-17-2022

Reranking Overgenerated Responses for End-to-End Task-Oriented Dialogue Systems

Hu, Songbo, Vulić, Ivan, Liu, Fangyu, Korhonen, Anna

End-to-end (E2E) task-oriented dialogue (ToD) systems are prone to fall into the so-called "likelihood trap", resulting in generated responses which are dull, repetitive, and often inconsistent with dialogue history. Comparing ranked lists of multiple generated responses against the "gold response" (from evaluation data) reveals a wide diversity in response quality, with many good responses placed lower in the ranked list. The main challenge, addressed in this work, is then how to reach beyond greedily generated system responses, that is, how to obtain and select such high-quality responses from the list of overgenerated responses at inference without availability of the gold response. To this end, we propose a simple yet effective reranking method which aims to select high-quality items from the lists of responses initially overgenerated by the system. The idea is to use any sequence-level (similarity) scoring function to divide the semantic space of responses into high-scoring versus low-scoring partitions. At training, the high-scoring partition comprises all generated responses whose similarity to the gold response is higher than the similarity of the greedy response to the gold response. At inference, the aim is to estimate the probability that each overgenerated response belongs to the high-scoring partition, given only previous dialogue history. We validate the robustness and versatility of our proposed method on the standard MultiWOZ dataset: our methods improve a state-of-the-art E2E ToD system by 2.0 BLEU, 1.6 ROUGE, and 1.3 METEOR scores, achieving new peak results. Additional experiments on the BiTOD dataset and human evaluation further ascertain the generalisability and effectiveness of the proposed framework.

computational linguistic, machine learning, natural language, (20 more...)

2211.03648

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Italy > Tuscany > Florence (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(25 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
(3 more...)

Li, Yimeng, Debnath, Arnab, Stein, Gregory J., Kosecka, Jana

Comparison of Model-Free and Model-Based Learning-Informed Planning for PointGoal Navigation

arXiv.org Artificial IntelligenceDec-17-2022

In recent years several learning approaches to point goal navigation in previously unseen environments have been proposed. They vary in the representations of the environments, problem decomposition, and experimental evaluation. In this work, we compare the state-of-the-art Deep Reinforcement Learning based approaches with Partially Observable Markov Decision Process (POMDP) formulation of the point goal navigation problem. We adapt the (POMDP) sub-goal framework proposed by [1] and modify the component that estimates frontier properties by using partial semantic maps of indoor scenes built from images' semantic segmentation. In addition to the well-known completeness of the model-based approach, we demonstrate that it is robust and efficient in that it leverages informative, learned properties of the frontiers compared to an optimistic frontier-based planner. We also demonstrate its data efficiency compared to the end-to-end deep reinforcement learning approaches. We compare our results against an optimistic planner, ANS and DD-PPO on Matterport3D dataset using the Habitat Simulator. We show comparable, though slightly worse performance than the SOTA DD-PPO approach, yet with far fewer data.

machine learning, navigation, reinforcement learning, (16 more...)

2212.08801

Country: North America > United States > Florida > Hillsborough County > University (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

arXiv.org Artificial IntelligenceDec-17-2022

Pre-Trained Image Encoder for Generalizable Visual Reinforcement Learning

Yuan, Zhecheng, Xue, Zhengrong, Yuan, Bo, Wang, Xueqian, Wu, Yi, Gao, Yang, Xu, Huazhe

Learning generalizable policies that can adapt to unseen environments remains challenging in visual Reinforcement Learning (RL). Existing approaches try to acquire a robust representation via diversifying the appearances of in-domain observations for better generalization. Limited by the specific observations of the environment, these methods ignore the possibility of exploring diverse real-world image datasets. In this paper, we investigate how a visual RL agent would benefit from the off-the-shelf visual representations. Surprisingly, we find that the early layers in an ImageNet pre-trained ResNet model could provide rather generalizable representations for visual RL. Hence, we propose Pre-trained Image Encoder for Generalizable visual reinforcement learning (PIE-G), a simple yet effective framework that can generalize to the unseen visual scenarios in a zero-shot manner. Extensive experiments are conducted on DMControl Generalization Benchmark, DMControl Manipulation Tasks, Drawer World, and CARLA to verify the effectiveness of PIE-G. Empirical evidence suggests PIE-G improves sample efficiency and significantly outperforms previous state-of-the-art methods in terms of generalization performance. In particular, PIE-G boasts a 55% generalization performance gain on average in the challenging video background setting. Project Page: https://sites.google.com/view/pie-g/home.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

2212.0886

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Artificial IntelligenceDec-16-2022

Partially Observable RL with B-Stability: Unified Structural Condition and Sharp Sample-Efficient Algorithms

Chen, Fan, Bai, Yu, Mei, Song

Partial Observability -- where agents can only observe partial information about the true underlying state of the system -- is ubiquitous in real-world applications of Reinforcement Learning (RL). Theoretically, learning a near-optimal policy under partial observability is known to be hard in the worst case due to an exponential sample complexity lower bound. Recent work has identified several tractable subclasses that are learnable with polynomial samples, such as Partially Observable Markov Decision Processes (POMDPs) with certain revealing or decodability conditions. However, this line of research is still in its infancy, where (1) unified structural conditions enabling sample-efficient learning are lacking; (2) existing sample complexities for known tractable subclasses are far from sharp; and (3) fewer sample-efficient algorithms are available than in fully observable RL. This paper advances all three aspects above for Partially Observable RL in the general setting of Predictive State Representations (PSRs). First, we propose a natural and unified structural condition for PSRs called \emph{B-stability}. B-stable PSRs encompasses the vast majority of known tractable subclasses such as weakly revealing POMDPs, low-rank future-sufficient POMDPs, decodable POMDPs, and regular PSRs. Next, we show that any B-stable PSR can be learned with polynomial samples in relevant problem parameters. When instantiated in the aforementioned subclasses, our sample complexities improve substantially over the current best ones. Finally, our results are achieved by three algorithms simultaneously: Optimistic Maximum Likelihood Estimation, Estimation-to-Decisions, and Model-Based Optimistic Posterior Sampling. The latter two algorithms are new for sample-efficient learning of POMDPs/PSRs.

artificial intelligence, machine learning, pomdp, (14 more...)

2209.1499

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Leisure & Entertainment (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)