AITopics | Edmonton

Edmonton

Joint Level Generation and Translation Using Gameplay Videos

arXiv.org Artificial Intelligence, Jun-28-2023

Procedural Content Generation via Machine Learning (PCGML) faces a significant hurdle that sets it apart from other fields, such as image or text generation: limited annotated data. Many existing methods for procedural level generation via machine learning require a secondary representation besides level images. However, the current methods for obtaining such representations are laborious and time-consuming, which contributes to this problem. In this work, we aim to address this problem by utilizing gameplay videos of two human-annotated games to develop a novel multi-tail framework that learns to perform simultaneous level translation and generation. The translation tail of our framework can convert gameplay video frames to an equivalent secondary representation, while its generation tail can produce novel level segments. Evaluation results and comparisons between our framework and baselines suggest that combining the level generation and translation tasks can lead to an overall improved performance regarding both tasks. This represents a possible solution to limited annotated level data, and we demonstrate the potential for future versions to generalize to unseen games.

artificial intelligence, machine learning, natural language, (18 more...)

2306.16662

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.14)
North America > United States (0.04)
North America > Canada > Alberta > Census Division No. 15 > Improvement District No. 9 > Banff (0.04)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment > Games > Computer Games (0.73)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Value-aware Importance Weighting for Off-policy Reinforcement Learning

De Asis, Kristopher, Graves, Eric, Sutton, Richard S.

arXiv.org Artificial Intelligence, Jun-27-2023

Importance sampling is a central idea underlying off-policy prediction in reinforcement learning. It provides a strategy for re-weighting samples from a distribution to obtain unbiased estimates under another distribution. However, importance sampling weights tend to exhibit extreme variance, often leading to stability issues in practice. In this work, we consider a broader class of importance weights to correct samples in off-policy learning. We propose the use of value-aware importance weights, which take into account the sample space to provide lower variance, but still unbiased, estimates under a target distribution. We derive how such weights can be computed, and detail key properties of the resulting importance weights. We then extend several reinforcement learning prediction algorithms to the off-policy setting with these weights, and evaluate them empirically.
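The re-weighting idea described in this abstract can be sketched with ordinary importance sampling. This is only the standard baseline estimator, not the value-aware weights the paper proposes; the two-action distributions, rewards, and function names below are hypothetical and chosen purely for illustration.

```python
import random


def importance_sampling_estimate(samples, target_prob, behavior_prob, f):
    """Ordinary importance sampling estimate of E_target[f(x)].

    `samples` are drawn from the behavior distribution; each one is
    re-weighted by the ratio target_prob(x) / behavior_prob(x).
    """
    total = 0.0
    for x in samples:
        rho = target_prob(x) / behavior_prob(x)  # importance weight
        total += rho * f(x)
    return total / len(samples)


# Hypothetical two-action example (all numbers are assumptions).
behavior = {0: 0.5, 1: 0.5}   # distribution that generated the data
target = {0: 0.9, 1: 0.1}     # distribution we want an estimate under
reward = {0: 1.0, 1: 10.0}    # quantity whose expectation we estimate

rng = random.Random(0)
samples = [
    rng.choices([0, 1], weights=[behavior[0], behavior[1]])[0]
    for _ in range(100_000)
]

estimate = importance_sampling_estimate(
    samples, target.get, behavior.get, reward.get
)
# Exact expectation under the target distribution, for comparison.
true_value = sum(target[a] * reward[a] for a in target)
```

With many samples, `estimate` should land close to `true_value`. The variance of the weighted terms grows as the two distributions diverge, which is the instability the abstract refers to.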

artificial intelligence, machine learning, reinforcement learning, (14 more...)

2306.15625

Country: North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Real-Time Reinforcement Learning for Vision-Based Robotics Utilizing Local and Remote Computers

Wang, Yan, Vasan, Gautham, Mahmood, A. Rupam

arXiv.org Artificial Intelligence, Jun-26-2023

Real-time learning is crucial for robotic agents adapting to ever-changing, non-stationary environments. A common setup for a robotic agent is to have two different computers simultaneously: a resource-limited local computer tethered to the robot and a powerful remote computer connected wirelessly. Given such a setup, it is unclear to what extent the performance of a learning system can be affected by resource limitations and how to efficiently use the wirelessly connected powerful computer to compensate for any performance loss. In this paper, we implement a real-time learning system called the Remote-Local Distributed (ReLoD) system to distribute computations of two deep reinforcement learning (RL) algorithms, Soft Actor-Critic (SAC) and Proximal Policy Optimization (PPO), between a local and a remote computer. The performance of the system is evaluated on two vision-based control tasks developed using a robotic arm and a mobile robot. Our results show that SAC's performance degrades heavily on a resource-limited local computer. Strikingly, when all computations of the learning system are deployed on a remote workstation, SAC fails to compensate for the performance loss, indicating that, without careful consideration, using a powerful remote computer may not result in performance improvement. However, a carefully chosen distribution of computations of SAC consistently and substantially improves its performance on both tasks. On the other hand, the performance of PPO remains largely unaffected by the distribution of computations. In addition, when all computations happen solely on a powerful tethered computer, the performance of our system remains on par with an existing system that is well-tuned for using a single machine. ReLoD is the only publicly available system for real-time RL that applies to multiple robots for vision-based tasks.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2210.02317

Country: North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Sim-to-Real Surgical Robot Learning and Autonomous Planning for Internal Tissue Points Manipulation using Reinforcement Learning

Ou, Yafei, Tavakoli, Mahdi

arXiv.org Artificial Intelligence, Jun-24-2023

Indirect simultaneous positioning (ISP), where internal tissue points are placed at desired locations indirectly through the manipulation of boundary points, is a type of subtask frequently performed in robotic surgeries. Although challenging due to complex tissue dynamics, automating the task can potentially reduce the workload of surgeons. This paper presents a sim-to-real framework for learning to automate the task without interacting with a real environment, and for planning preoperatively to find the grasping points that minimize local tissue deformation. A control policy is learned using deep reinforcement learning (DRL) in the FEM-based simulation environment and transferred to a real-world situation. Grasping points are planned in the simulator by utilizing the trained policy using Bayesian optimization (BO). Inconsistent simulation performance is overcome by formulating the problem as a state-augmented Markov decision process (MDP). Experimental results show that the learned policy places the internal tissue points accurately, and that the planned grasping points yield small tissue deformation among the trials. The proposed learning and planning scheme is able to automate internal tissue point manipulation in surgeries and has the potential to be generalized to complex surgical scenarios.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

doi: 10.1109/LRA.2023.3254860

2306.14085

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Surgery (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

UAlberta at SemEval-2023 Task 1: Context Augmentation and Translation for Multilingual Visual Word Sense Disambiguation

Ogezi, Michael, Hauer, Bradley, Omarov, Talgat, Shi, Ning, Kondrak, Grzegorz

arXiv.org Artificial Intelligence, Jun-24-2023

We describe the systems of the University of Alberta team for the SemEval-2023 Visual Word Sense Disambiguation (V-WSD) Task. We present a novel algorithm that leverages glosses retrieved from BabelNet, in combination with text and image encoders. Furthermore, we compare language-specific encoders against the application of English encoders to translated texts. As the contexts given in the task datasets are extremely short, we also experiment with augmenting these contexts with descriptions generated by a language model. This yields substantial improvements in accuracy. We describe and evaluate additional V-WSD methods which use image generation and text-conditioned image segmentation. Overall, the results of our official submission rank us 18 out of 56 teams. Some of our unofficial results are even better than the official ones. Our code is publicly available at https://github.com/UAlberta-NLP/v-wsd.

focus word, machine learning, natural language, (16 more...)

2306.14067

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
North America > Dominican Republic (0.04)
North America > Canada > Ontario > Toronto (0.04)
(3 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Machine Learning Approach for Cancer Entities Association and Classification

Jeyakodi, G., Pal, Arkadeep, Gupta, Debapratim, Sarukeswari, K., Amouda, V.

arXiv.org Artificial Intelligence, Jun-24-2023

Numerous biomedical research articles are published regularly, adding knowledge to the accumulated literature on different diseases, such as cancer, neurodegenerative diseases, and hereditary diseases. Cancer is one of the leading causes of global mortality, owing to factors such as lifestyle habits, radiation exposure, viral infections, and tobacco consumption [1] [2]. These factors ultimately cause genetic changes in a cell of a tissue, which make it cancerous. Because cancer research receives top priority compared with other human diseases, an enormous number of articles have been published [3] [4] in a short period [5]. This literature can serve as a relevant source for cancer knowledge discovery in diagnostics, application of drugs, genetic association, prevention, and treatment. Automated downloading of articles and extraction of related entities would speed up this research. Natural Language Processing (NLP) helps computers communicate with humans in their own language and converts unstructured data into structured data to improve the accuracy of text mining. NLP helps interpret human query language so that knowledge can be discovered from the literature without much manual effort [6]. Named Entity Recognition (NER) and text classification are the main techniques used for text mining [7].

artificial intelligence, machine learning, natural language, (20 more...)

2306.00013

Country:

Asia > India > Puducherry (0.05)
Oceania > New Zealand > North Island > Waikato (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(3 more...)

Genre: Research Report (0.52)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

Correcting discount-factor mismatch in on-policy policy gradient methods

Che, Fengdi, Vasan, Gautham, Mahmood, A. Rupam

arXiv.org Artificial Intelligence, Jun-23-2023

The policy gradient theorem gives a convenient form of the policy gradient in terms of three factors: an action value, a gradient of the action likelihood, and a state distribution involving discounting called the discounted stationary distribution. But commonly used on-policy methods based on the policy gradient theorem ignore the discount factor in the state distribution, which is technically incorrect and may even cause degenerate learning behavior in some environments. An existing solution corrects this discrepancy by using $\gamma^t$ as a factor in the gradient estimate. However, this solution is not widely adopted and does not work well in tasks where the later states are similar to earlier states. We introduce a novel distribution correction to account for the discounted stationary distribution that can be plugged into many existing gradient estimators. Our correction circumvents the performance degradation associated with the $\gamma^t$ correction with a lower variance. Importantly, compared to the uncorrected estimators, our algorithm provides improved state emphasis to evade suboptimal policies in certain environments and consistently matches or exceeds the original performance on several OpenAI gym and DeepMind suite benchmarks.
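The existing $\gamma^t$ correction mentioned in this abstract can be sketched for a REINFORCE-style estimator, where the gradient contribution at step t is a scalar weight times the gradient of log pi(a_t | s_t). The code below computes only those scalar weights, with and without the $\gamma^t$ factor. It does not implement the paper's new distribution correction, and the function names and example rewards are hypothetical.

```python
def discounted_returns(rewards, gamma):
    """Return G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for each t."""
    returns = []
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def per_step_weights(rewards, gamma, correct_state_distribution):
    """Scalar that multiplies grad log pi(a_t | s_t) at each time step.

    With correct_state_distribution=True the weight is gamma^t * G_t,
    matching the discounted state distribution; with False it is G_t,
    the commonly used form that drops the gamma^t factor.
    """
    returns = discounted_returns(rewards, gamma)
    weights = []
    for t, g in enumerate(returns):
        factor = gamma ** t if correct_state_distribution else 1.0
        weights.append(factor * g)
    return weights


# Hypothetical three-step episode with reward 1 at every step.
rewards = [1.0, 1.0, 1.0]
uncorrected = per_step_weights(rewards, gamma=0.5, correct_state_distribution=False)
corrected = per_step_weights(rewards, gamma=0.5, correct_state_distribution=True)
# uncorrected is [1.75, 1.5, 1.0]; corrected is [1.75, 0.75, 0.25]
```

The corrected weights shrink geometrically with t, so late time steps contribute little to the update. This is consistent with the abstract's remark that the $\gamma^t$ correction works poorly when later states resemble earlier ones.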

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2306.13284

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.54)

Exploring the Training Robustness of Distributional Reinforcement Learning against Noisy State Observations

Sun, Ke, Zhao, Yingnan, Jui, Shangling, Kong, Linglong

arXiv.org Artificial Intelligence, Jun-21-2023

In real scenarios, the state observations that an agent receives may contain measurement errors or adversarial noise, which can mislead the agent into taking suboptimal actions or even cause training to collapse. In this paper, we study the training robustness of distributional Reinforcement Learning (RL), a class of state-of-the-art methods that estimate the whole distribution, as opposed to only the expectation, of the total return. Firstly, we validate the contraction of distributional Bellman operators in the State-Noisy Markov Decision Process (SN-MDP), a typical tabular case that incorporates both random and adversarial state observation noises. In the noisy setting with function approximation, we then analyze the vulnerability of the least squares loss in expectation-based RL with either linear or nonlinear function approximation. By contrast, we theoretically characterize the bounded gradient norm of the distributional RL loss based on the categorical parameterization equipped with the KL divergence. The resulting stable gradients during optimization account for the better training robustness of distributional RL against state observation noises. Finally, extensive experiments on the suite of environments verified that distributional RL is less vulnerable to both random and adversarial noisy state observations than its expectation-based counterpart.
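The contrast between bounded and unbounded gradients can be illustrated in its simplest form. The sketch below is not the paper's analysis: it assumes a categorical distribution parameterized directly by logits, for which the gradient of KL(target || softmax(logits)) with respect to the logits is softmax(logits) - target, a vector whose Euclidean norm is at most sqrt(2). The gradient of a squared loss, by comparison, grows linearly with the error. All function names and inputs are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_grad_wrt_logits(logits, target_probs):
    """Gradient of KL(target || softmax(logits)) with respect to the logits."""
    probs = softmax(logits)
    return [p - q for p, q in zip(probs, target_probs)]


def squared_loss_grad(prediction, target):
    """Gradient of 0.5 * (prediction - target)^2 with respect to the prediction."""
    return prediction - target


def l2_norm(vec):
    return math.sqrt(sum(v * v for v in vec))


# Extreme mismatch: the model puts almost all mass on atom 0,
# while the (possibly noisy) target puts all mass on atom 1.
kl_grad = kl_grad_wrt_logits([50.0, -50.0, 0.0], [0.0, 1.0, 0.0])
kl_grad_norm = l2_norm(kl_grad)  # stays below sqrt(2)

# A badly corrupted scalar target yields a proportionally large gradient.
sq_grad = squared_loss_grad(0.0, 1000.0)
```

However far the logits and the target disagree, `kl_grad_norm` cannot exceed sqrt(2), whereas the magnitude of `sq_grad` scales with the size of the corruption in the target.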

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2109.08776

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Deep Learning of Dynamical System Parameters from Return Maps as Images

Stephens, Connor James, Blazquez, Emmanuel

arXiv.org Artificial Intelligence, Jun-19-2023

We present a novel approach to system identification (SI) using deep learning techniques. Focusing on parametric system identification (PSI), we use a supervised learning approach for estimating the parameters of discrete and continuous-time dynamical systems, irrespective of chaos. To accomplish this, we transform collections of state-space trajectory observations into image-like data to retain the state-space topology of trajectories from dynamical systems and train convolutional neural networks to estimate the parameters of dynamical systems from these images. We demonstrate that our approach can learn parameter estimation functions for various dynamical systems, and by using training-time data augmentation, we are able to learn estimation functions whose parameter estimates are robust to changes in the sample fidelity of their inputs. Once trained, these estimation models return parameter estimations for new systems with negligible time and computation costs.
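The transformation from trajectories to image-like data can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it assumes the logistic map as the dynamical system, a return map built from consecutive pairs (x_n, x_{n+1}), and a fixed-size count grid as the image. The function names, grid size, and parameter values are hypothetical.

```python
def logistic_trajectory(r, x0, n_steps):
    """Iterate the logistic map x_{n+1} = r * x_n * (1 - x_n)."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def return_map_image(xs, size):
    """Bin consecutive pairs (x_n, x_{n+1}) from [0, 1]^2 into a size-by-size grid.

    Each cell counts how many pairs fall into it, giving an image-like
    array that could be fed to a convolutional network.
    """
    image = [[0] * size for _ in range(size)]
    for x_now, x_next in zip(xs[:-1], xs[1:]):
        col = min(int(x_now * size), size - 1)
        row = min(int(x_next * size), size - 1)
        image[row][col] += 1
    return image


# Hypothetical parameter value in the chaotic regime of the logistic map.
trajectory = logistic_trajectory(r=3.9, x0=0.2, n_steps=1000)
image = return_map_image(trajectory, size=32)
```

In a supervised setup like the one the abstract describes, many such images generated with known parameter values (here `r`) would serve as inputs, with the parameter as the regression target.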

artificial intelligence, machine learning, trajectory, (18 more...)

2306.11258

Country:

North America > Canada > Alberta > Census Division No. 11 > Edmonton Metropolitan Region > Edmonton (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Netherlands > South Holland > Noordwijk (0.04)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Multilingual Multiword Expression Identification Using Lateral Inhibition and Domain Adaptation

Avram, Andrei-Marius, Mititelu, Verginica Barbu, Păiş, Vasile, Cercel, Dumitru-Clementin, Trăuşan-Matu, Ştefan

arXiv.org Artificial Intelligence, Jun-17-2023

Correctly identifying multiword expressions (MWEs) is an important task for most natural language processing systems since their misidentification can result in ambiguity and misunderstanding of the underlying text. In this work, we evaluate the performance of the mBERT model for MWE identification in a multilingual context by training it on all 14 languages available in version 1.2 of the PARSEME corpus. We also incorporate lateral inhibition and language adversarial training into our methodology to create language-independent embeddings and improve its capabilities in identifying multiword expressions. The evaluation of our models shows that the approach employed in this work achieves better results compared to the best system of the PARSEME 1.2 competition, MTLB-STRUCT, on 11 out of 14 languages for global MWE identification and on 12 out of 14 languages for unseen MWE identification. Additionally, averaged across all languages, our best approach outperforms the MTLB-STRUCT system by 1.23% on global MWE identification and by 4.73% on unseen global MWE identification.

artificial intelligence, machine learning, natural language, (17 more...)

2306.10419

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(23 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.67)
Information Technology > Artificial Intelligence > Cognitive Science > Neuroscience (0.63)
