Goto

Collaborating Authors

 Calgary


Mining GIS Data to Predict Urban Sprawl

arXiv.org Artificial Intelligence

This paper addresses the interesting problem of processing and analyzing data in geographic information systems (GIS) to achieve a clear perspective on urban sprawl. The term urban sprawl refers to overgrowth and expansion of low-density areas with issues such as car dependency and segregation between residential versus commercial use. Sprawl has impacts on the environment and public health. In our work, spatiotemporal features related to real GIS data on urban sprawl such as population growth and demographics are mined to discover knowledge for decision support. We adapt data mining algorithms, Apriori for association rule mining and J4.8 for decision tree classification to geospatial analysis, deploying the ArcGIS tool for mapping. Knowledge discovered by mining this spatiotemporal data is used to implement a prototype spatial decision support system (SDSS). This SDSS predicts whether urban sprawl is likely to occur. Further, it estimates the values of pertinent variables to understand how the variables impact each other. The SDSS can help decision-makers identify problems and create solutions for avoiding future sprawl occurrence and conducting urban planning where sprawl already occurs, thus aiding sustainable development. This work falls in the broad realm of geospatial intelligence and sets the stage for designing a large scale SDSS to process big data in complex environments, which constitutes part of our future work.


EnHMM: On the Use of Ensemble HMMs and Stack Traces to Predict the Reassignment of Bug Report Fields

arXiv.org Artificial Intelligence

Bug reports (BR) contain vital information that can help triaging teams prioritize and assign bugs to developers who will provide the fixes. However, studies have shown that BR fields often contain incorrect information that need to be reassigned, which delays the bug fixing process. There exist approaches for predicting whether a BR field should be reassigned or not. These studies use mainly BR descriptions and traditional machine learning algorithms (SVM, KNN, etc.). As such, they do not fully benefit from the sequential order of information in BR data, such as function call sequences in BR stack traces, which may be valuable for improving the prediction accuracy. In this paper, we propose a novel approach, called EnHMM, for predicting the reassignment of BR fields using ensemble Hidden Markov Models (HMMs), trained on stack traces. EnHMM leverages the natural ability of HMMs to represent sequential data to model the temporal order of function calls in BR stack traces. When applied to Eclipse and Gnome BR repositories, EnHMM achieves an average precision, recall, and F-measure of 54%, 76%, and 60% on Eclipse dataset and 41%, 69%, and 51% on Gnome dataset. We also found that EnHMM improves over the best single HMM by 36% for Eclipse and 76% for Gnome. Finally, when comparing EnHMM to Im.ML.KNN, a recent approach in the field, we found that the average F-measure score of EnHMM improves the average F-measure of Im.ML.KNN by 6.80% and improves the average recall of Im.ML.KNN by 36.09%. However, the average precision of EnHMM is lower than that of Im.ML.KNN (53.93% as opposed to 56.71%).


Detection of Alzheimer's Disease Using Graph-Regularized Convolutional Neural Network Based on Structural Similarity Learning of Brain Magnetic Resonance Images

arXiv.org Artificial Intelligence

Objective: This paper presents an Alzheimer's disease (AD) detection method based on learning structural similarity between Magnetic Resonance Images (MRIs) and representing this similarity as a graph. Methods: We construct the similarity graph using embedded features of the input image (i.e., Non-Demented (ND), Very Mild Demented (VMD), Mild Demented (MD), and Moderated Demented (MDTD)). We experiment and compare different dimension-reduction and clustering algorithms to construct the best similarity graph to capture the similarity between the same class images using the cosine distance as a similarity measure. We utilize the similarity graph to present (sample) the training data to a convolutional neural network (CNN). We use the similarity graph as a regularizer in the loss function of a CNN model to minimize the distance between the input images and their k-nearest neighbours in the similarity graph while minimizing the categorical cross-entropy loss between the training image predictions and the actual image class labels. Results: We conduct extensive experiments with several pre-trained CNN models and compare the results to other recent methods. Conclusion: Our method achieves superior performance on the testing dataset (accuracy = 0.986, area under receiver operating characteristics curve = 0.998, F1 measure = 0.987). Significance: The classification results show an improvement in the prediction accuracy compared to the other methods. We release all the code used in our experiments to encourage reproducible research in this area


Cascaded Models With Cyclic Feedback For Direct Speech Translation

arXiv.org Artificial Intelligence

Direct speech translation describes a scenario where only speech inputs and corresponding translations are available. Such data are notoriously limited. We present a technique that allows cascades of automatic speech recognition (ASR) and machine translation (MT) to exploit in-domain direct speech translation data in addition to out-of-domain MT and ASR data. After pre-training MT and ASR, we use a feedback cycle where the downstream performance of the MT system is used as a signal to improve the ASR system by self-training, and the MT component is fine-tuned on multiple ASR outputs, making it more tolerant towards spelling variations. A comparison to end-to-end speech translation using components of identical architecture and the same data shows gains of up to 3.8 BLEU points on LibriVoxDeEn and up to 5.1 BLEU points on CoVoST for German-to-English speech translation.


On the Philosophical, Cognitive and Mathematical Foundations of Symbiotic Autonomous Systems (SAS)

arXiv.org Artificial Intelligence

Symbiotic Autonomous Systems (SAS) are advanced intelligent and cognitive systems exhibiting autonomous collective intelligence enabled by coherent symbiosis of human-machine interactions in hybrid societies. Basic research in the emerging field of SAS has triggered advanced general AI technologies functioning without human intervention or hybrid symbiotic systems synergizing humans and intelligent machines into coherent cognitive systems. This work presents a theoretical framework of SAS underpinned by the latest advances in intelligence, cognition, computer, and system sciences. SAS are characterized by the composition of autonomous and symbiotic systems that adopt bio-brain-social-inspired and heterogeneously synergized structures and autonomous behaviors. This paper explores their cognitive and mathematical foundations. The challenge to seamless human-machine interactions in a hybrid environment is addressed. SAS-based collective intelligence is explored in order to augment human capability by autonomous machine intelligence towards the next generation of general AI, autonomous computers, and trustworthy mission-critical intelligent systems. Emerging paradigms and engineering applications of SAS are elaborated via an autonomous knowledge learning system that symbiotically works between humans and cognitive robots.


Learning Curve Theory

arXiv.org Machine Learning

Recently a number of empirical "universal" scaling law papers have been published, most notably by OpenAI. `Scaling laws' refers to power-law decreases of training or test error w.r.t. more data, larger neural networks, and/or more compute. In this work we focus on scaling w.r.t. data size $n$. Theoretical understanding of this phenomenon is largely lacking, except in finite-dimensional models for which error typically decreases with $n^{-1/2}$ or $n^{-1}$, where $n$ is the sample size. We develop and theoretically analyse the simplest possible (toy) model that can exhibit $n^{-\beta}$ learning curves for arbitrary power $\beta>0$, and determine whether power laws are universal or depend on the data distribution.


Persistent Rule-based Interactive Reinforcement Learning

arXiv.org Artificial Intelligence

Interactive reinforcement learning has allowed speeding up the learning process in autonomous agents by including a human trainer providing extra information to the agent in real-time. Current interactive reinforcement learning research has been limited to interactions that offer relevant advice to the current state only. Additionally, the information provided by each interaction is not retained and instead discarded by the agent after a single-use. In this work, we propose a persistent rule-based interactive reinforcement learning approach, i.e., a method for retaining and reusing provided knowledge, allowing trainers to give general advice relevant to more than just the current state. Our experimental results show persistent advice substantially improves the performance of the agent while reducing the number of interactions required for the trainer. Moreover, rule-based advice shows similar performance impact as state-based advice, but with a substantially reduced interaction count.


Automatic Curriculum Learning With Over-repetition Penalty for Dialogue Policy Learning

arXiv.org Artificial Intelligence

Dialogue policy learning based on reinforcement learning is difficult to be applied to real users to train dialogue agents from scratch because of the high cost. User simulators, which choose random user goals for the dialogue agent to train on, have been considered as an affordable substitute for real users. However, this random sampling method ignores the law of human learning, making the learned dialogue policy inefficient and unstable. We propose a novel framework, Automatic Curriculum Learning-based Deep Q-Network (ACL-DQN), which replaces the traditional random sampling method with a teacher policy model to realize the dialogue policy for automatic curriculum learning. The teacher model arranges a meaningful ordered curriculum and automatically adjusts it by monitoring the learning progress of the dialogue agent and the over-repetition penalty without any requirement of prior knowledge. The learning progress of the dialogue agent reflects the relationship between the dialogue agent's ability and the sampled goals' difficulty for sample efficiency. The over-repetition penalty guarantees the sampled diversity. Experiments show that the ACL-DQN significantly improves the effectiveness and stability of dialogue tasks with a statistically significant margin. Furthermore, the framework can be further improved by equipping with different curriculum schedules, which demonstrates that the framework has strong generalizability.


Modeling Heterogeneous Statistical Patterns in High-dimensional Data by Adversarial Distributions: An Unsupervised Generative Framework

arXiv.org Artificial Intelligence

Since the label collecting is prohibitive and time-consuming, unsupervised methods are preferred in applications such as fraud detection. Meanwhile, such applications usually require modeling the intrinsic clusters in high-dimensional data, which usually displays heterogeneous statistical patterns as the patterns of different clusters may appear in different dimensions. Existing methods propose to model the data clusters on selected dimensions, yet globally omitting any dimension may damage the pattern of certain clusters. To address the above issues, we propose a novel unsupervised generative framework called FIRD, which utilizes adversarial distributions to fit and disentangle the heterogeneous statistical patterns. When applying to discrete spaces, FIRD effectively distinguishes the synchronized fraudsters from normal users. Besides, FIRD also provides superior performance on anomaly detection datasets compared with SOTA anomaly detection methods (over 5% average AUC improvement). The significant experiment results on various datasets verify that the proposed method can better model the heterogeneous statistical patterns in high-dimensional data and benefit downstream applications.


I'm Sorry for Your Loss: Spectrally-Based Audio Distances Are Bad at Pitch

arXiv.org Artificial Intelligence

Growing research demonstrates that synthetic failure modes imply poor generalization. We compare commonly used audio-to-audio losses on a synthetic benchmark, measuring the pitch distance between two stationary sinusoids. The results are surprising: many have poor sense of pitch direction. These shortcomings are exposed using simple rank assumptions. Our task is trivial for humans but difficult for these audio distances, suggesting significant progress can be made in self-supervised audio learning by improving current losses.