Goto

Collaborating Authors

Results


Topology of deep neural networks

arXiv.org Machine Learning

We study how the topology of a data set $M = M_a \cup M_b \subseteq \mathbb{R}^d$, representing two classes $a$ and $b$ in a binary classification problem, changes as it passes through the layers of a well-trained neural network, i.e., with perfect accuracy on training set and near-zero generalization error ($\approx 0.01\%$). The goal is to shed light on two mysteries in deep neural networks: (i) a nonsmooth activation function like ReLU outperforms a smooth one like hyperbolic tangent; (ii) successful neural network architectures rely on having many layers, even though a shallow network can approximate any function arbitrary well. We performed extensive experiments on the persistent homology of a wide range of point cloud data sets, both real and simulated. The results consistently demonstrate the following: (1) Neural networks operate by changing topology, transforming a topologically complicated data set into a topologically simple one as it passes through the layers. No matter how complicated the topology of $M$ we begin with, when passed through a well-trained neural network $f : \mathbb{R}^d \to \mathbb{R}^p$, there is a vast reduction in the Betti numbers of both components $M_a$ and $M_b$; in fact they nearly always reduce to their lowest possible values: $\beta_k\bigl(f(M_i)\bigr) = 0$ for $k \ge 1$ and $\beta_0\bigl(f(M_i)\bigr) = 1$, $i =a, b$. Furthermore, (2) the reduction in Betti numbers is significantly faster for ReLU activation than hyperbolic tangent activation as the former defines nonhomeomorphic maps that change topology, whereas the latter defines homeomorphic maps that preserve topology. Lastly, (3) shallow and deep networks transform data sets differently -- a shallow network operates mainly through changing geometry and changes topology only in its final layers, a deep one spreads topological changes more evenly across all layers.


Will we ever have Conscious Machines?

arXiv.org Artificial Intelligence

The question of whether artificial beings or machines could become self-aware or consciousness has been a philosophical question for centuries. The main problem is that self-awareness cannot be observed from an outside perspective and the distinction of whether something is really self-aware or merely a clever program that pretends to do so cannot be answered without access to accurate knowledge about the mechanism's inner workings. We review the current state-of-the-art regarding these developments and investigate common machine learning approaches with respect to their potential ability to become self-aware. We realise that many important algorithmic steps towards machines with a core consciousness have already been devised. For human-level intelligence, however, many additional techniques have to be discovered.


Emotion Recognition From Gait Analyses: Current Research and Future Directions

arXiv.org Machine Learning

Human gait refers to a daily motion that represents not only mobility, but it can also be used to identify the walker by either human observers or computers. Recent studies reveal that gait even conveys information about the walker's emotion. Individuals in different emotion states may show different gait patterns. The mapping between various emotions and gait patterns provides a new source for automated emotion recognition. Compared to traditional emotion detection biometrics, such as facial expression, speech and physiological parameters, gait is remotely observable, more difficult to imitate, and requires less cooperation from the subject. These advantages make gait a promising source for emotion detection. This article reviews current research on gait-based emotion detection, particularly on how gait parameters can be affected by different emotion states and how the emotion states can be recognized through distinct gait patterns. We focus on the detailed methods and techniques applied in the whole process of emotion recognition: data collection, preprocessing, and classification. At last, we discuss possible future developments of efficient and effective gait-based emotion recognition using the state of the art techniques on intelligent computation and big data.


Keeping it simple: Implementation and performance of the proto-principle of adaptation and learning in the language sciences

arXiv.org Machine Learning

It is predated by three publications only: the seminal work of McCulloch and Pitts (1943) that hypothesized how neurons might work by relying on analogy to electrical circuits; Donald Hebb's book The Organization of Behavior (1949), which famously stipulated the basic principle of association of neurons by means of neural co-activation (i.e., assembling); and Frank Rosenblatt's work on the Perceptron (Rosenblatt, 1958). Importantly, however, the Widrow-Hoff rule was the first one that was successfully applied to real-life problems (e.g., noise cancellation in telephone lines which is used to date; cf., Haykin, 1999). After the initial excitement and until the (more) recent successes, models such as those mentioned above that were inspired biologically or, more specifically, neurally were ignored in favour of machines implementing von Neumann's traditional architecture. During the 1970s, the Widrow-Hoff rule was accidentally rediscovered in Psychology by Rescorla and Wagner (1972) who worked on animal and human learning, and by Kohonen (1972) in his work on Self-Organizing Maps in Computer Science. Finally, the widely known and successful Connectionist Parallel-Distributed Processing Models have the Widrow-Hoff rule as their principal building block (cf., McClelland & Rumelhart, 1986).


Machine Learning Approaches For Motor Learning: A Short Review

arXiv.org Machine Learning

The use of machine learning to model motor learning mechanisms is still limited, while it could help to design novel interactive systems for movement learning or rehabilitation. This approach requires to account for the motor variability induced by motor learning mechanisms. This represents specific challenges concerning fast adaptability of the computational models, from small variations to more drastic changes, including new movement classes. We propose a short review on machine learning based movement models and their existing adaptation mechanisms. We discuss the current challenges for applying these models in motor learning support systems, delineating promising research directions at the intersection of machine learning and motor learning.


Intrinsic Motivation and Episodic Memories for Robot Exploration of High-Dimensional Sensory Spaces

arXiv.org Artificial Intelligence

This work presents an architecture that generates curiosity-driven goal-directed exploration behaviours for an image sensor of a microfarming robot. A combination of deep neural networks for offline unsupervised learning of low-dimensional features from images, and of online learning of shallow neural networks representing the inverse and forward kinematics of the system have been used. The artificial curiosity system assigns interest values to a set of pre-defined goals, and drives the exploration towards those that are expected to maximise the learning progress. We propose the integration of an episodic memory in intrinsic motivation systems to face catastrophic forgetting issues, typically experienced when performing online updates of artificial neural networks. Our results show that adopting an episodic memory system not only prevents the computational models from quickly forgetting knowledge that has been previously acquired, but also provides new avenues for modulating the balance between plasticity and stability of the models.


Artificial Intelligence for Social Good: A Survey

arXiv.org Artificial Intelligence

Its impact is drastic and real: Youtube's AIdriven recommendation system would present sports videos for days if one happens to watch a live baseball game on the platform [1]; email writing becomes much faster with machine learning (ML) based auto-completion [2]; many businesses have adopted natural language processing based chatbots as part of their customer services [3]. AI has also greatly advanced human capabilities in complex decision-making processes ranging from determining how to allocate security resources to protect airports [4] to games such as poker [5] and Go [6]. All such tangible and stunning progress suggests that an "AI summer" is happening. As some put it, "AI is the new electricity" [7]. Meanwhile, in the past decade, an emerging theme in the AI research community is the so-called "AI for social good" (AI4SG): researchers aim at developing AI methods and tools to address problems at the societal level and improve the wellbeing of the society.


A Comprehensive Survey on Transfer Learning

arXiv.org Machine Learning

Transfer learning aims at improving the performance of target learners on target domains by transferring the knowledge contained in different but related source domains. In this way, the dependence on a large number of target domain data can be reduced for constructing target learners. Due to the wide application prospects, transfer learning has become a popular and promising area in machine learning. Although there are already some valuable and impressive surveys on transfer learning, these surveys introduce approaches in a relatively isolated way and lack the recent advances in transfer learning. As the rapid expansion of the transfer learning area, it is both necessary and challenging to comprehensively review the relevant studies. This survey attempts to connect and systematize the existing transfer learning researches, as well as to summarize and interpret the mechanisms and the strategies in a comprehensive way, which may help readers have a better understanding of the current research status and ideas. Different from previous surveys, this survey paper reviews over forty representative transfer learning approaches from the perspectives of data and model. The applications of transfer learning are also briefly introduced. In order to show the performance of different transfer learning models, twenty representative transfer learning models are used for experiments. The models are performed on three different datasets, i.e., Amazon Reviews, Reuters-21578, and Office-31. And the experimental results demonstrate the importance of selecting appropriate transfer learning models for different applications in practice.


The reliability of a deep learning model in clinical out-of-distribution MRI data: a multicohort study

arXiv.org Machine Learning

Deep learning (DL) methods have in recent years yielded impressive results in medical imaging, with the potential to function as clinical aid to radiologists. However, DL models in medical imaging are often trained on public research cohorts with images acquired with a single scanner or with strict protocol harmonization, which is not representative of a clinical setting. The aim of this study was to investigate how well a DL model performs in unseen clinical data sets---collected with different scanners, protocols and disease populations---and whether more heterogeneous training data improves generalization. In total, 3117 MRI scans of brains from multiple dementia research cohorts and memory clinics, that had been visually rated by a neuroradiologist according to Scheltens' scale of medial temporal atrophy (MTA), were included in this study. By training multiple versions of a convolutional neural network on different subsets of this data to predict MTA ratings, we assessed the impact of including images from a wider distribution during training had on performance in external memory clinic data. Our results showed that our model generalized well to data sets acquired with similar protocols as the training data, but substantially worse in clinical cohorts with visibly different tissue contrasts in the images. This implies that future DL studies investigating performance in out-of-distribution (OOD) MRI data need to assess multiple external cohorts for reliable results. Further, by including data from a wider range of scanners and protocols the performance improved in OOD data, which suggests that more heterogeneous training data makes the model generalize better. To conclude, this is the most comprehensive study to date investigating the domain shift in deep learning on MRI data, and we advocate rigorous evaluation of DL models on clinical data prior to being certified for deployment.


Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges toward Responsible AI

arXiv.org Artificial Intelligence

In the last years, Artificial Intelligence (AI) has achieved a notable momentum that may deliver the best of expectations over many application sectors across the field. For this to occur, the entire community stands in front of the barrier of explainability, an inherent problem of AI techniques brought by sub-symbolism (e.g. ensembles or Deep Neural Networks) that were not present in the last hype of AI. Paradigms underlying this problem fall within the so-called eXplainable AI (XAI) field, which is acknowledged as a crucial feature for the practical deployment of AI models. This overview examines the existing literature in the field of XAI, including a prospect toward what is yet to be reached. We summarize previous efforts to define explainability in Machine Learning, establishing a novel definition that covers prior conceptual propositions with a major focus on the audience for which explainability is sought. We then propose and discuss about a taxonomy of recent contributions related to the explainability of different Machine Learning models, including those aimed at Deep Learning methods for which a second taxonomy is built. This literature analysis serves as the background for a series of challenges faced by XAI, such as the crossroads between data fusion and explainability. Our prospects lead toward the concept of Responsible Artificial Intelligence, namely, a methodology for the large-scale implementation of AI methods in real organizations with fairness, model explainability and accountability at its core. Our ultimate goal is to provide newcomers to XAI with a reference material in order to stimulate future research advances, but also to encourage experts and professionals from other disciplines to embrace the benefits of AI in their activity sectors, without any prior bias for its lack of interpretability.