Goto

Collaborating Authors

 Statistical Learning


Machine Learning Techniques for Diagnostic Differentiation of Mild Cognitive Impairment and Dementia

AAAI Conferences

Detection of cognitive impairment, especially at the early stages, is critical. Such detection has traditionally been performed manually by one or more clinicians based on reports and test results. Machine learning algorithms offer an alternative method of detection that may provide an automated process and valuable insights into diagnosis and classification. In this paper, we explore the use of neuropsychological and demographic data to predict Clinical Dementia Rating (CDR) scores (no dementia, very mild dementia, dementia) and clinical diagnoses (cognitively healthy, mild cognitive impairment, dementia) through the implementation of four machine learning algorithms, naรฏve Bayes (NB), C4.5 decision tree (DT), back-propagation neural network (NN), and support vector machine (SVM). Additionally, a feature selection method for reducing the number of neuropsychological and demographic data needed to make an accurate diagnosis was investigated. The NB classifier provided the best accuracies, while the SVM classifier proved to offer some of the lowest accuracies. We also illustrate that with the use of feature selection, accuracies can be improved. The experiments reported in this paper indicate that artificial intelligence techniques can be used to automate aspects of clinical diagnosis of individuals with cognitive impairment.


SALL-E: Situated Agent for Language Learning

AAAI Conferences

We describe ongoing research towards building a cognitively plausible system for near one-shot learning of the meanings of attribute words and object names, by grounding them in a sensory model. The system learns incrementally from human demonstrations recorded with the Microsoft Kinect, in which the demonstrator can use unrestricted natural language descriptions. We achieve near-one shot learning of simple objects and attributes by focusing solely on examples where the learning agent is confident, ignoring the rest of the data. We evaluate the system's learning ability by having it generate descriptions of presented objects, including objects it has never seen before, and comparing the system response against collected human descriptions of the same objects. We propose that our method of retrieving object examples with a k-nearest neighbor classifier using Mahalanobis distance corresponds to a cognitively plausible representation of objects. Our initial results show promise for achieving rapid, near one-shot, incremental learning of word meanings.


Time-Dependent Trajectory Regression on Road Networks via Multi-Task Learning

AAAI Conferences

Road travel costs are important knowledge hidden in large-scale GPS trajectory data sets, the discovery of which can benefit many applications such as intelligent route planning and automatic driving navigation. While there are previous studies which tackled this task by modeling it as a regression problem with spatial smoothness taken into account, they unreasonably assumed that the latent cost of each road remains unchanged over time. Other works on route planning and recommendation that have considered temporal factors simply assumed that the temporal dynamics be known in advance as a parametric function over time, which is not faithful to reality. To overcome these limitations, in this paper, we propose an extension to a previous static trajectory regression framework by learning the temporal dynamics of road travel costs in an innovative non-parametric manner which can effectively overcome the temporal sparsity problem. In particular, we unify multiple different trajectory regression problems in a multi-task framework by introducing a novel cross-task regularization which encourages temporal smoothness on the change of road travel costs. We then propose an efficient block coordinate descent method to solve the resulting problem by exploiting its separable structures and prove its convergence to global optimum. Experiments conducted on both synthetic and real data sets demonstrate the effectiveness of our method and its improved accuracy on travel time prediction.


Towards Cohesive Anomaly Mining

AAAI Conferences

In some applications, such as bioinformatics, social network analysis, and computational criminology, it is desirable to find compact clusters formed by a (very) small portion of objects in a large data set. Since such clusters are comprised of a small number of objects, they are extraordinary and anomalous with respect to the entire data set. This specific type of clustering task cannot be solved well by the conventional clustering methods since generally those methods try to assign most of the data objects into clusters. In this paper, we model this novel and application-inspired task as the problem of mining cohesive anomalies. We propose a general framework and a principled approach to tackle the problem. The experimental results on both synthetic and real data sets verify the effectiveness and efficiency of our approach.


Optimizing Objective Function Parameters for Strength in Computer Game-Playing

AAAI Conferences

The learning of evaluation functions from game records has been widely studied in the field of computer game-playing. Conventional learning methods optimize the evaluation function parameters by using the game records of expert players in order to imitate their plays. Such conventional methods utilize objective functions to increase the agreement between the moves selected by game-playing programs and the moves in the records of actual games. The methods, however, have a problem in that increasing the agreement does not always improve the strength of a program. Indeed, it is not clear how this agreement relates to the strength of a trained program. To address this problem, this paper presents a learning method to optimize objective function parameters for strength in game-playing. The proposed method employs an evolutionary learning algorithm with the strengths (Elo ratings) of programs as their fitness scores. Experimental results show that the proposed method is effective since programs using the objective function produced by the proposed method are superior to those using conventional objective functions.


Active Task Selection for Lifelong Machine Learning

AAAI Conferences

In a lifelong learning framework, an agent acquires knowledge incrementally over consecutive learning tasks, continually building upon its experience. Recent lifelong learning algorithms have achieved nearly identical performance to batch multi-task learning methods while reducing learning time by three orders of magnitude. In this paper, we further improve the scalability of lifelong learning by developing curriculum selection methods that enable an agent to actively select the next task to learn in order to maximize performance on future learning tasks. We demonstrate that active task selection is highly reliable and effective, allowing an agent to learn high performance models using up to 50% fewer tasks than when the agent has no control over the task order. We also explore a variant of transfer learning in the lifelong learning setting in which the agent can focus knowledge acquisition toward a particular target task.


Continuous Conditional Random Fields for Efficient Regression in Large Fully Connected Graphs

AAAI Conferences

When used for structured regression, powerful Conditional Random Fields (CRFs) are typically restricted to modeling effects of interactions among examples in local neighborhoods. Using more expressive representation would result in dense graphs, making these methods impractical for large-scale applications. To address this issue, we propose an effective CRF model with linear scale-up properties regarding approximate learning and inference for structured regression on large, fully connected graphs. The proposed method is validated on real-world large-scale problems of image de-noising and remote sensing. In conducted experiments, we demonstrated that dense connectivity provides an improvement in prediction accuracy. Inference time of less than ten seconds on graphs with millions of nodes and trillions of edges makes the proposed model an attractive tool for large-scale, structured regression problems.


Salient Object Detection via Low-Rank and Structured Sparse Matrix Decomposition

AAAI Conferences

Salient object detection provides an alternative solution to various image semantic understanding tasks such as object recognition, adaptive compression and image retrieval. Recently, low-rank matrix recovery (LR) theory has been introduced into saliency detection, and achieves impressed results. However, the existing LR-based models neglect the underlying structure of images, and inevitably degrade the associated performance. In this paper, we propose a Low-rank and Structured sparse Matrix Decomposition (LSMD) model for salient object detection. In the model, a tree-structured sparsity-inducing norm regularization is firstly introduced to provide a hierarchical description of the image structure to ensure the completeness of the extracted salient object. The similarity of saliency values within the salient object is then guaranteed by the $\ell _\infty$-norm. Finally, high-level priors are integrated to guide the matrix decomposition and enhance the saliency detection. Experimental results on the largest public benchmark database show that our model outperforms existing LR-based approaches and other state-of-the-art methods, which verifies the effectiveness and robustness of the structure cues in our model.


An Agent Design for Repeated Negotiation and Information Revelation with People

AAAI Conferences

Many negotiations in the real world are characterized by incomplete information, and participants' success depends on their ability to reveal information in a way that facilitates agreement without compromising the individual gains of agents. This paper presents a novel agent design for repeated negotiation in incomplete information settings that learns to reveal information strategically during the negotiation process. The agent used classical machine learning techniques to predict how people make and respond to offers during the negotiation, how they reveal information and their response to potential revelation actions by the agent. The agent was evaluated empirically in an extensive empirical study spanning hundreds of human subjects. Results show that the agent was able to outperform people. In particular, it learned (1) to make offers that were beneficial to people while not compromising its own benefit; (2) to incrementally reveal information to people in a way that increased its expected performance. The approach generalizes to new settings without the need to acquire additional data. This work demonstrates the efficacy of combining machine learning with opponent modeling techniques towards the design of computer agents for negotiating with people in settings of incomplete information.


Sample Complexity and Performance Bounds for Non-Parametric Approximate Linear Programming

AAAI Conferences

One of the most difficult tasks in value function approximation for Markov Decision Processes is finding an approximation architecture that is expressive enough to capture the important structure in the value function, while at the same time not overfitting the training samples. Recent results in non-parametric approximate linear programming (NP-ALP), have demonstrated that this can be done effectively using nothing more than a smoothness assumption on the value function. In this paper we extend these results to the case where samples come from real world transitions instead of the full Bellman equation, adding robustness to noise. In addition, we provide the first max-norm, finite sample performance guarantees for any form of ALP. NP-ALP is amenable to problems with large (multidimensional) or even infinite (continuous) action spaces, and does not require a model to select actions using the resulting approximate solution.