AITopics | Coates, Mark

Active Learning on Attributed Graphs via Graph Cognizant Logistic Regression and Preemptive Query Generation

Regol, Florence, Pal, Soumyasundar, Zhang, Yingxue, Coates, Mark

arXiv.org Machine Learning, Jul-9-2020

Node classification in attributed graphs is an important task in multiple practical settings, but it can often be difficult or expensive to obtain labels. Active learning can improve the achieved classification performance for a given budget on the number of queried labels. The best existing methods are based on graph neural networks, but they often perform poorly unless a sizeable validation set of labelled nodes is available in order to choose good hyperparameters. We propose a novel graph-based active learning algorithm for the task of node classification in attributed graphs; our algorithm uses graph cognizant logistic regression, equivalent to a linearized graph convolutional neural network (GCN), for the prediction phase and maximizes the expected error reduction in the query phase. To reduce the delay experienced by a labeller interacting with the system, we derive a preemptive querying system that calculates a new query during the labelling process, and to address the setting where learning starts with almost no labelled data, we also develop a hybrid algorithm that performs adaptive model averaging of label propagation and linearized GCN inference. We conduct experiments on five public benchmark datasets, demonstrating a significant improvement over state-of-the-art approaches and illustrate the practical value of the method by applying it to a private microwave link network dataset.
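As a rough illustration of the prediction phase described above, the sketch below assumes that a linearized GCN amounts to smoothing node features with a symmetrically normalized adjacency matrix (with self-loops) before a logistic regression is fitted. The normalization, the `num_hops` parameter, and all function names are assumptions made for illustration and are not taken from the paper; the logistic regression step itself is omitted.

```python
import math


def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense 0/1 adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def propagate_features(adj, features, num_hops=2):
    """Smooth node features over the graph num_hops times (assumed linearization)."""
    s = normalized_adjacency(adj)
    z = features
    for _ in range(num_hops):
        z = matmul(s, z)
    return z


# Toy path graph 0 - 1 - 2 with one-hot node features (made-up data).
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
smoothed = propagate_features(adjacency, features, num_hops=1)
```

The smoothed features would then be passed to an ordinary multinomial logistic regression, which is what makes the model linear in its parameters under this assumption.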

neural network, neurology, node, (17 more...)

arXiv.org Machine Learning

2007.05003

Country:

Europe (1.00)
North America > Canada > Quebec > Montreal (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Research Report > New Finding (0.73)
Research Report > Experimental Study (0.73)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.75)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.73)

Non-Parametric Graph Learning for Bayesian Graph Neural Networks

Pal, Soumyasundar, Malekmohammadi, Saber, Regol, Florence, Zhang, Yingxue, Xu, Yishi, Coates, Mark

arXiv.org Machine Learning, Jun-23-2020

Graphs are ubiquitous in modelling relational structures. Recent endeavours in machine learning for graph-structured data have led to many architectures and learning algorithms. However, the graph used by these algorithms is often constructed based on inaccurate modelling assumptions and/or noisy data. As a result, it fails to represent the true relationships between nodes. A Bayesian framework which targets posterior inference of the graph by considering it as a random quantity can be beneficial. In this paper, we propose a novel non-parametric graph model for constructing the posterior distribution of graph adjacency matrices. The proposed model is flexible in the sense that it can effectively take into account the output of graph-based learning algorithms that target specific tasks. In addition, model inference scales well to large graphs. We demonstrate the advantages of this model in three different problem settings: node classification, link prediction and recommendation.

deep learning, graph, neural network, (16 more...)

arXiv.org Machine Learning

2006.13335

Country: North America > Canada > Quebec > Montreal (0.14)

Genre: Research Report (0.82)

Bayesian Graph Convolutional Neural Networks Using Non-Parametric Graph Learning

Pal, Soumyasundar, Regol, Florence, Coates, Mark

arXiv.org Machine Learning, Oct-26-2019

Graph convolutional neural networks (GCNN) have been successfully applied to many different graph-based learning tasks including node and graph classification, matrix completion, and learning of node embeddings. Despite their impressive performance, the techniques have a limited capability to incorporate the uncertainty in the underlying graph structure. In order to address this issue, a Bayesian GCNN (BGCN) framework was recently proposed. In this framework, the observed graph is considered to be a random realization from a parametric random graph model and the joint Bayesian inference of the graph and GCNN weights is performed. In this paper, we propose a non-parametric generative model for graphs and incorporate it within the BGCN framework. In addition to the observed graph, our approach effectively uses the node features and training labels in the posterior inference of graphs and attains superior or comparable performance in benchmark node classification tasks.

bayesian inference, deep learning, neural network, (18 more...)

arXiv.org Machine Learning

1910.12132

Country: North America (0.47)

Genre: Research Report (0.50)

Learning Gaussian Graphical Models with Ordered Weighted L1 Regularization

Mazza-Anthony, Cody, Mazoure, Bogdan, Coates, Mark

arXiv.org Machine Learning, Jun-6-2019

We address the task of identifying densely connected subsets of multivariate Gaussian random variables within a graphical model framework. We propose two novel estimators based on the Ordered Weighted $\ell_1$ (OWL) norm: 1) The Graphical OWL (GOWL) is a penalized likelihood method that applies the OWL norm to the lower triangle components of the precision matrix. 2) The column-by-column Graphical OWL (ccGOWL) estimates the precision matrix by performing OWL regularized linear regressions. Both methods can simultaneously identify highly correlated groups of variables and control the sparsity in the resulting precision matrix. We formulate GOWL such that it solves a composite optimization problem and establish that the estimator has a unique global solution. In addition, we prove sufficient grouping conditions for each column of the ccGOWL precision matrix estimate. We propose proximal descent algorithms to find the optimum for both estimators. For synthetic data where group structure is present, the ccGOWL estimator requires significantly reduced computation and achieves similar or greater accuracy than state-of-the-art estimators. Timing comparisons are presented and demonstrate the superior computational efficiency of the ccGOWL. We illustrate the grouping performance of the ccGOWL method on a cancer gene expression data set and an equities data set.
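For readers unfamiliar with the OWL norm mentioned above, a minimal sketch follows. It uses the standard definition (sort the magnitudes in decreasing order and pair them with non-negative, non-increasing weights); the function name, the example vector, and the weights are made up for illustration and do not come from the paper.

```python
def owl_norm(x, weights):
    """Ordered weighted L1 norm: sum_i weights[i] * |x|_(i),
    where |x|_(1) >= |x|_(2) >= ... are the sorted magnitudes of x."""
    if len(x) != len(weights):
        raise ValueError("x and weights must have the same length")
    non_increasing = all(a >= b for a, b in zip(weights, weights[1:]))
    if not non_increasing or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative and non-increasing")
    magnitudes = sorted((abs(v) for v in x), reverse=True)
    return sum(w * m for w, m in zip(weights, magnitudes))


# Hypothetical lower-triangle entries of a precision matrix.
entries = [0.5, -2.0, 0.0, 1.0]
owl_value = owl_norm(entries, [4.0, 3.0, 2.0, 1.0])  # largest magnitude gets the largest weight
l1_value = owl_norm(entries, [1.0, 1.0, 1.0, 1.0])   # equal weights recover the plain L1 norm
```

With equal weights the penalty reduces to the ordinary $\ell_1$ norm, which is one way to see how the OWL norm generalizes lasso-style regularization.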

health & medicine, matrix, oncology, (19 more...)

arXiv.org Machine Learning

1906.02719

Genre: Research Report (0.64)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Oncology > Leukemia (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Bayesian graph convolutional neural networks for semi-supervised classification

Zhang, Yingxue, Pal, Soumyasundar, Coates, Mark, Üstebay, Deniz

arXiv.org Machine Learning, Nov-27-2018

Recently, techniques for applying convolutional neural networks to graph-structured data have emerged. Graph convolutional neural networks (GCNNs) have been used to address node and graph classification and matrix completion. Although the performance has been impressive, the current implementations have limited capability to incorporate uncertainty in the graph structure. Almost all GCNNs process a graph as though it is a ground-truth depiction of the relationship between nodes, but often the graphs employed in applications are themselves derived from noisy data or modelling assumptions. Spurious edges may be included; other edges may be missing between nodes that have very strong relationships. In this paper we adopt a Bayesian approach, viewing the observed graph as a realization from a parametric family of random graphs. We then target inference of the joint posterior of the random graph parameters and the node (or graph) labels. We present the Bayesian GCNN framework and develop an iterative learning procedure for the case of assortative mixed-membership stochastic block models. We present the results of experiments that demonstrate that the Bayesian formulation can provide better performance when there are very few labels available during the training process.

deep learning, neural network, node, (16 more...)

arXiv.org Machine Learning

1811.11103

Country: North America > Canada > Quebec > Montreal (0.14)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (0.48)
Government > Military (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.89)

Cost Adaptation for Robust Decentralized Swarm Behaviour

Henderson, Peter, Vertescher, Matthew, Meger, David, Coates, Mark

arXiv.org Artificial Intelligence, Sep-29-2018

Decentralized receding horizon control (D-RHC) provides a mechanism for coordination in multi-agent settings without a centralized command center. However, combining a set of different goals, costs, and constraints to form an efficient optimization objective for D-RHC can be difficult. To allay this problem, we use a meta-learning process -- cost adaptation -- which generates the optimization objective for D-RHC to solve based on a set of human-generated priors (cost and constraint functions) and an auxiliary heuristic. We use this adaptive D-RHC method for control of mesh-networked swarm agents. This formulation allows a wide range of tasks to be encoded and can account for network delays, heterogeneous capabilities, and increasingly large swarms through the adaptation mechanism. We leverage the Unity3D game engine to build a simulator capable of introducing artificial networking failures and delays in the swarm. Using the simulator we validate our method on an example coordinated exploration task. We demonstrate that cost adaptation allows for more efficient and safer task completion under varying environment conditions and increasingly large swarm sizes. We release our simulator and code to the community for future work.

agent, artificial intelligence, optimization problem, (19 more...)

arXiv.org Artificial Intelligence

1709.07114

Country: North America > Canada > Quebec > Montreal (0.14)

Genre: Research Report (0.40)

Industry:

Leisure & Entertainment > Games (0.34)
Information Technology (0.34)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Microwave breast cancer detection using Empirical Mode Decomposition features

Song, Hongchao, Li, Yunpeng, Coates, Mark, Men, Aidong

arXiv.org Machine Learning, Feb-24-2017

Microwave-based breast cancer detection has been proposed as a complementary approach to compensate for some drawbacks of existing breast cancer detection techniques. Among the existing microwave breast cancer detection methods, machine learning-type algorithms have recently become more popular. These focus on detecting the existence of breast tumours rather than performing imaging to identify the exact tumour position. A key step of the machine learning approaches is feature extraction. One of the most widely used feature extraction methods is principal component analysis (PCA). However, it can be sensitive to signal misalignment. This paper presents an empirical mode decomposition (EMD)-based feature extraction method, which is more robust to the misalignment. Experimental results involving clinical data sets combined with numerically simulated tumour responses show that combined features from EMD and PCA improve the detection performance with an ensemble selection-based classifier.

breast cancer detection, health & medicine, oncology, (17 more...)

arXiv.org Machine Learning

1702.07608

Country:

Europe (0.68)
North America > Canada > Quebec > Montreal (0.14)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Data Science > Data Mining > Feature Extraction (0.77)
Information Technology > Data Science > Data Quality > Data Transformation (0.72)

Sparse Multivariate Factor Regression

Kharratzadeh, Milad, Coates, Mark

arXiv.org Machine Learning, Feb-29-2016

We consider the problem of multivariate regression in a setting where the relevant predictors could be shared among different responses. We propose an algorithm which decomposes the coefficient matrix into the product of a long matrix and a wide matrix, with an elastic net penalty on the former and an $\ell_1$ penalty on the latter. The first matrix linearly transforms the predictors to a set of latent factors, and the second one regresses the responses on these factors. Our algorithm simultaneously performs dimension reduction and coefficient estimation and automatically estimates the number of latent factors from the data. Our formulation results in a non-convex optimization problem, which, despite its flexibility to impose effective low-dimensional structure, is difficult, or even impossible, to solve exactly in a reasonable time. We specify an optimization algorithm based on alternating minimization with three different sets of updates to solve this non-convex problem and provide theoretical results on its convergence and optimality. Finally, we demonstrate the effectiveness of our algorithm via experiments on simulated and real data.
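To make the factorization described above concrete, the sketch below evaluates one plausible penalized objective: a squared-error fit of Y against X A B, an elastic net penalty on the long matrix A, and an $\ell_1$ penalty on the wide matrix B. The exact scaling of each term, the three penalty parameters, and all names are assumptions for illustration; the abstract does not specify them, and the alternating minimization itself is not shown.

```python
def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def factor_regression_objective(X, Y, A, B, lam_l1_A, lam_l2_A, lam_l1_B):
    """Assumed objective: 0.5 * ||Y - X A B||_F^2
    + lam_l1_A * ||A||_1 + lam_l2_A * ||A||_F^2 + lam_l1_B * ||B||_1."""
    prediction = matmul(matmul(X, A), B)  # X A gives latent factors, B maps them to responses
    fit = 0.5 * sum((y - p) ** 2
                    for y_row, p_row in zip(Y, prediction)
                    for y, p in zip(y_row, p_row))
    l1_A = sum(abs(v) for row in A for v in row)
    l2_A = sum(v * v for row in A for v in row)
    l1_B = sum(abs(v) for row in B for v in row)
    return fit + lam_l1_A * l1_A + lam_l2_A * l2_A + lam_l1_B * l1_B


# Made-up data: 2 samples, 3 predictors, 1 latent factor, 2 responses.
X = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
A = [[1.0], [0.0], [1.0]]          # long matrix: predictors -> latent factor
B = [[1.0, -1.0]]                  # wide matrix: latent factor -> responses
Y = matmul(matmul(X, A), B)        # responses generated without noise
objective_value = factor_regression_objective(X, Y, A, B, 0.5, 0.25, 1.0)
```

Because Y is generated exactly from the factorization here, the fit term is zero and the value reflects the penalties alone.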

algorithm, health & medicine, optimization problem, (21 more...)

arXiv.org Machine Learning

1502.07334

Country: North America > Canada > Quebec > Montreal (0.14)

Genre: Research Report > New Finding (0.46)

Industry:

Health & Medicine (1.00)
Banking & Finance (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Semi-parametric Order-based Generalized Multivariate Regression

Kharratzadeh, Milad, Coates, Mark

arXiv.org Machine Learning, Feb-19-2016

In this paper, we consider a generalized multivariate regression problem where the responses are monotonic functions of linear transformations of predictors. We propose a semi-parametric algorithm based on the ordering of the responses which is invariant to the functional form of the transformation function. We prove that our algorithm, which maximizes the rank correlation of responses and linear transformations of predictors, is a consistent estimator of the true coefficient matrix. We also identify the rate of convergence and show that the squared estimation error decays with a rate of $o(1/\sqrt{n})$. We then propose a greedy algorithm to maximize the highly non-smooth objective function of our model and examine its performance through extensive simulations. Finally, we compare our algorithm with traditional multivariate regression algorithms over synthetic and real data.
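The invariance property mentioned above can be checked with a small sketch. It assumes Kendall's tau as the rank correlation (the abstract does not say which rank correlation is used) and shows that the value is unchanged when the responses are any strictly increasing function of the linear index. All names and numbers are illustrative.

```python
import math


def kendall_tau(a, b):
    """Kendall's tau for sequences without ties: (concordant - discordant) / total pairs."""
    n = len(a)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (a[i] - a[j]) * (b[i] - b[j])
            if sign > 0:
                score += 1
            elif sign < 0:
                score -= 1
    return score / (n * (n - 1) / 2)


# Hypothetical values of a linear transformation of the predictors.
linear_index = [0.3, -1.2, 2.0, 0.7]

# Two different strictly increasing link functions applied to the same index.
responses_exp = [math.exp(v) for v in linear_index]
responses_cubic = [v ** 3 for v in linear_index]

tau_exp = kendall_tau(linear_index, responses_exp)
tau_cubic = kendall_tau(linear_index, responses_cubic)
tau_reversed = kendall_tau(linear_index, [-v for v in responses_exp])
```

Both link functions give the same rank correlation with the linear index, which is why an estimator built on ordering alone does not need to know the functional form of the transformation.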

algorithm, artificial intelligence, health & medicine, (18 more...)

arXiv.org Machine Learning

1602.06276

Country: North America > Canada > Quebec > Montreal (0.14)

Genre: Research Report > Experimental Study (0.46)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Weblog Analysis for Predicting Correlations in Stock Price Evolutions

Kharratzadeh, Milad (McGill University) | Coates, Mark (McGill University)

AAAI Conferences, Feb-22-2012

We use data extracted from many weblogs to identify the underlying relations of a set of companies in the Standard and Poor's (S&P) 500 index. We define a pairwise similarity measure for the companies based on the weblog articles and then apply a graph clustering procedure. We show that it is possible to capture some interesting relations between companies using this method. As an application of this clustering procedure we propose a cluster-based portfolio selection method which combines information from the weblog data and historical stock prices. Through simulation experiments, we show that our method performs better (in terms of risk measures) than cluster-based portfolio strategies based on company sectors or historical stock prices. This suggests that the methodology has the potential to identify groups of companies whose stock prices are more likely to be correlated in the future.

artificial intelligence, banking & finance, stock price, (18 more...)

AAAI Conferences

Sixth International AAAI Conference on Weblogs and Social Media

Country: North America > Canada > Quebec > Montreal (0.14)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Communications (0.95)
Information Technology > Data Science > Data Mining (0.90)
