
Rediscovering alignment relations with Graph Convolutional Networks

arXiv.org Artificial Intelligence

Knowledge graphs are concurrently published and edited in the Web of data. Hence they may overlap, which makes matching their content a key task. This task encompasses the identification, within and across knowledge graphs, of nodes that are equivalent, more specific, or weakly related. In this article, we propose to match nodes of a knowledge graph by (i) learning node embeddings with Graph Convolutional Networks such that similar nodes have low distances in the embedding space, and (ii) clustering nodes based on their embeddings. We evaluated this approach on a biomedical knowledge graph and particularly investigated the interplay between formal semantics and GCN models, with two main focuses. Firstly, we applied various inference rules associated with domain knowledge, independently or combined, before learning node embeddings, and we measured the improvements in matching results. Secondly, while our GCN model is agnostic to the exact alignment relations (e.g., equivalence, weak similarity), we observed that distances in the embedding space are coherent with the "strength" of these different relations (e.g., smaller distances for equivalences), somehow corresponding to their rediscovery by the model.
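To make step (i) of this abstract more concrete, here is a minimal sketch of a single graph-convolution propagation step in plain Python. The abstract does not specify the authors' architecture, so the toy graph, the node features, and the fixed (untrained) weight matrix are all assumptions made for illustration, and the clustering of step (ii) is not shown. The sketch only illustrates why nodes with the same features and the same neighbourhood end up close together in the embedding space.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, feats, weight):
    """One propagation step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adj)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric degree normalisation of the adjacency matrix.
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]

# Hypothetical toy graph: nodes 0 and 1 are both linked to node 2; node 3 is isolated.
adj = [
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
# Hypothetical two-dimensional node features (nodes 0 and 1 share the same features).
feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
# Assumption: identity weights stand in for weights that would normally be learned.
weight = [[1.0, 0.0], [0.0, 1.0]]

emb = gcn_layer(adj, feats, weight)
d_similar = math.dist(emb[0], emb[1])    # same features, same neighbourhood
d_different = math.dist(emb[0], emb[3])  # different features and neighbourhood
```

In this toy setting `d_similar` is zero while `d_different` is not. In a trained model the weights would be learned so that distances reflect the matching task rather than being fixed by hand.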


Spoken Language Interaction with Robots: Research Issues and Recommendations, Report from the NSF Future Directions Workshop

arXiv.org Artificial Intelligence

With robotics rapidly advancing, more effective human-robot interaction is increasingly needed to realize the full potential of robots for society. While spoken language must be part of the solution, our ability to provide spoken language interaction capabilities is still very limited. The National Science Foundation accordingly convened a workshop, bringing together speech, language, and robotics researchers to discuss what needs to be done. The result is this report, in which we identify key scientific and engineering advances needed. Our recommendations broadly relate to eight general themes. First, meeting human needs requires addressing new challenges in speech technology and user experience design. Second, this requires better models of the social and interactive aspects of language use. Third, for robustness, robots need higher-bandwidth communication with users and better handling of uncertainty, including simultaneous consideration of multiple hypotheses and goals. Fourth, more powerful adaptation methods are needed, to enable robots to communicate in new environments, for new tasks, and with diverse user populations, without extensive re-engineering or the collection of massive training data. Fifth, since robots are embodied, speech should function together with other communication modalities, such as gaze, gesture, posture, and motion. Sixth, since robots operate in complex environments, speech components need access to rich yet efficient representations of what the robot knows about objects, locations, noise sources, the user, and other humans. Seventh, since robots operate in real time, their speech and language processing components must also. Eighth, in addition to more research, we need more work on infrastructure and resources, including shareable software modules and internal interfaces, inexpensive hardware, baseline systems, and diverse corpora.


Graph Kernels: State-of-the-Art and Future Challenges

arXiv.org Machine Learning

Among the data structures commonly used in machine learning, graphs are arguably one of the most general. Graphs allow modelling complex objects as a collection of entities (nodes) and of relationships between such entities (edges), each of which can be annotated by metadata such as categorical or vectorial node and edge features. Many ubiquitous data types can be understood as particular cases of graphs, including unstructured vectorial data as well as structured data types such as time series, images, volumetric data, point clouds or bags of entities, to name a few. Most importantly, numerous applications benefit from the extra flexibility that graph-based representations provide. In chemoinformatics, graphs have been used extensively to represent molecular compounds (Trinajstic, 2018), with nodes corresponding to atoms, edges to chemical bonds, and node and edge features encoding known chemical properties of each atom and bond in the molecule. Machine learning approaches operating on such graph-based representations of molecules are becoming increasingly successful in learning to predict complex molecular properties from large annotated data sets (Duvenaud et al., 2015; Gilmer et al., 2017; Wu et al., 2018), offering a promising set of tools for drug discovery (Vamathevan et al., 2019). In computational biology, graphs have likewise risen to prominence due to their ability to describe multifaceted interactions between (biological) entities.


Comment on Chapter 1

#artificialintelligence

Note on style - this initial blog post will be divided into three sections. In the first I will review what I understand to be the key points from the first chapter of the fastai book. In the second I will mention how, following the authors’ instructions, I set up a ‘workspace’ for DL. And finally I’ll discuss some open questions I have.

Key points from Chapter 1

With this chapter Howard and Gugger provide a useful overview of the subject of Deep Learning, some extremely useful tips on how to set up a development environment for DL coding and analysis (more on that below) and a summary of their approach to teaching DL. The authors make great efforts to make the subject approachable, emphasising that neither advanced qualifications nor high-level coding ability are necessary to implement Deep Learning techniques. Indeed, they say right from the beginning that they intend to give readers a sense of ‘the complete game’. I can say from experience of other courses or books in this area that this is a refreshingly different approach. They also provide a good overview of the history of the discipline - from McCulloch and Pitts’ notion of an artificial neuron to more contemporary concepts such as Parallel Distributed Processing (PDP).

For me the most interesting topic is the difference between Machine Learning and more traditional forms of programming. From what I understand, traditional programming is based on the notion that inputs from the user will go through a function (defined by the programmer) and specific output(s) will be the result. This approach works well when, for example, we want to automate repetitive tasks. However it is not suitable for more complex or conceptual tasks, such as recognising the difference between a cat and a dog, imitating a particular author’s writing style or making a good movie recommendation. The reason traditional approaches don’t work here is that a programmer would need to specify every single aspect relevant to the task.

A far better approach then is one where the machine itself can ‘learn’, i.e. the programme (or model) has a process inherent to itself that enables it to output a result that is intelligible and accurate to humans. To make this clearer, I briefly touch on the solution that ultimately caught on, which was conceived by Arthur Samuel and called Machine Learning. ML essentially involves taking data, weighting it in some way (through labelling the data for instance) and training the programme (now referred to as a model) to recognise patterns within that data. This process will repeat, with the weights adjusting through each cycle, until the programmer considers the programme sufficiently accurate. Interestingly, once the model is trained, it can be used in the manner of a traditional programme. That means novel data can be introduced, without weighting, and the model will then make predictions - again, for example, whether a picture shows a cat or a dog. While this training process is based on repetition, training too frequently on the same data set will actually decrease accuracy. This is a situation known as over-fitting, where the model makes predictions too close to its training data. My summary here may sound very theoretical but Howard and Gugger present their account in quite a practical fashion with lots of coding and real-life examples.

Setting up a DL coding environment

One surprising element of this course is the great advice that Howard and Gugger offer on how to set up your working environment for DL projects. In fact, this may have been the feature that ultimately persuaded me to work through the entire course. The best thing to do is check their site for the details, but just to give a brief summary of what they describe: Howard and Gugger have developed a framework called fastai that enables users to access DL techniques in PyTorch in a much more straightforward manner than is possible through coding for PyTorch directly. (I’ve not personally used PyTorch but this is my understanding of what fastai does.) One consequence of this is that a GPU is required for fastai to function. I do not have a GPU in my rather cheap Lenovo laptop, but not to worry, as the authors provide a very useful, and thorough, guide on how to set up a cloud system to run the course exercises, which are written in standard Jupyter Notebooks (although Google Colab versions are also available). I would emphasise that this is in no way an intimidating or difficult process; in fact I’m almost stunned by how easy it is to set up a cloud computer.

Personally I’m using a free service called Gradient which is offered by a company known as Paperspace. While there are some restrictions (such as the cloud system shutting down automatically after 6 hours), Paperspace have integrated fastai’s Jupyter Notebooks into their service, which means you can jump right in once you’ve set it up. The only complaint I have, and it’s very minor, is that Paperspace do give you the option of running the course notebooks on systems without GPUs, which makes no sense since the notebooks won’t work without GPUs. Also a few times, when I’ve started the virtual machine, it seems to have automatically selected the CPU system. Anyway I certainly can’t complain; this is a great service and I think Paperspace should be given some credit for making DL this accessible for free to basically anyone.

As a side note, another extremely useful discussion on setting up your coding environment can be found in Wes McKinney’s Python for Data Analysis. McKinney’s text focuses far more on the mechanics of data analysis; personally I see it more as a reference book than something I would read cover to cover. Nevertheless the opening chapter, where he discusses the basic modules required for doing data analysis in Python, is something I return to every time I set up a new PC.

Open questions

One point I’m not entirely clear on is how the theoretical notion of artificial neurons can produce intelligible results. Having done some other more mathematically focused courses on this subject, I believe the reason that artificial neurons, in particular through layering, can do this is to do with the fact they can be thought of as strings of matrix multiplications. This allows the system to compute linear algebra, which enables the computer to predict an answer. I’m not completely happy with my own explanation here. I do think at some point I will have to do a deep dive on the Maths but for now this is my understanding.
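As a small illustration of the "strings of matrix multiplications" idea above, the following plain-Python sketch passes one input through two layers. The weights and the input are made-up numbers chosen for illustration, not anything taken from the fastai book. It shows one detail that may help with the open question: if layers are only matrix multiplications, two layers collapse into a single matrix, so the non-linear step between them (here a ReLU, which sets negative values to zero) is what lets stacked layers compute something a single layer cannot.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def relu(m):
    """Replace every negative entry with zero."""
    return [[max(0.0, v) for v in row] for row in m]

# Hypothetical numbers, chosen only so the arithmetic is easy to follow.
x = [[1.0, -2.0]]                 # one input with two features
w1 = [[1.0, 0.5], [1.0, -1.0]]    # first layer weights (2 x 2)
w2 = [[2.0], [1.0]]               # second layer weights (2 x 1)

# Two purely linear layers...
linear_only = matmul(matmul(x, w1), w2)
# ...give the same result as one layer whose weights are w1 times w2.
collapsed = matmul(x, matmul(w1, w2))
# With a ReLU between the layers the result is different.
with_relu = matmul(relu(matmul(x, w1)), w2)

print(linear_only, collapsed, with_relu)
```

Here the first layer outputs `[-1.0, 2.5]`; without the ReLU the final value is `0.5`, and with it the negative entry is zeroed and the final value becomes `2.5`. Training, as described earlier in the post, would then adjust `w1` and `w2` rather than leaving them fixed.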


A contribution to Optimal Transport on incomparable spaces

arXiv.org Machine Learning

Optimal Transport is a theory that makes it possible to define geometrical notions of distance between probability distributions and to find correspondences, or relationships, between sets of points. Many machine learning applications are derived from this theory, at the frontier between mathematics and optimization. This thesis proposes to study the complex scenario in which the different data belong to incomparable spaces. In particular we address the following questions: how to define and apply Optimal Transport between graphs and between structured data? How can it be adapted when the data are varied and not embedded in the same metric space? This thesis proposes a set of Optimal Transport tools for these different cases. An important part is notably devoted to the study of the Gromov-Wasserstein distance, whose properties make it possible to define interesting transport problems on incomparable spaces. More broadly, we analyze the mathematical properties of the various proposed tools, we establish algorithmic solutions to compute them and we study their applicability in numerous machine learning scenarios which cover, in particular, classification, simplification, partitioning of structured data, as well as heterogeneous domain adaptation.


Challenges of Applying Deep Reinforcement Learning in Dynamic Dispatching

arXiv.org Artificial Intelligence

Dynamic dispatching aims to smartly allocate the right resources to the right place at the right time. Dynamic dispatching is one of the core problems for operations optimization in the mining industry. Theoretically, deep reinforcement learning (RL) should be a natural fit to solve this problem. However, the industry relies on heuristics or even human intuition, which often yield short-sighted and sub-optimal solutions. In this paper, we review the main challenges in using deep RL to address the dynamic dispatching problem in the mining industry.


Multi-Agent Active Search using Realistic Depth-Aware Noise Model

arXiv.org Artificial Intelligence

The search for objects of interest in an unknown environment by making data-collection decisions (i.e., active search or active sensing) has robotics applications in many fields, including the search and rescue of human survivors following disasters, detecting gas leaks or locating and preventing animal poachers. Existing algorithms often prioritize the location accuracy of objects of interest while other practical issues such as the reliability of object detection as a function of distance and lines of sight remain largely ignored. An additional challenge is that in many active search scenarios, communication infrastructure may be damaged, unreliable, or unestablished, making centralized control of multiple search agents impractical. We present an algorithm called Noise-Aware Thompson Sampling (NATS) that addresses these issues for multiple ground-based robot agents performing active search considering two sources of sensory information from monocular optical imagery and sonar tracking. NATS utilizes communications between robot agents in a decentralized manner that is robust to intermittent loss of communication links. Additionally, it takes into account object detection uncertainty from depth as well as environmental occlusions. Using simulation results, we show that NATS significantly outperforms existing methods such as information-greedy policies or exhaustive search. We demonstrate the real-world viability of NATS using a photo-realistic environment created in the Unreal Engine 4 game development platform with the AirSim plugin.


Artificial Intelligence Decision Support for Medical Triage

arXiv.org Artificial Intelligence

Applying state-of-the-art machine learning and natural language processing to approximately one million teleconsultation records, we developed a triage system, now certified and in use at the largest European telemedicine provider. The system evaluates care alternatives through interactions with patients via a mobile application. Reasoning on an initial set of provided symptoms, the triage application generates AI-powered, personalized questions to better characterize the problem and recommends the most appropriate point of care and time frame for a consultation. The underlying technology was developed to meet the needs for performance, transparency, user acceptance and ease of use, central aspects to the adoption of AI-based decision support systems. Providing such remote guidance at the beginning of the chain of care has significant potential for improving cost efficiency, patient experience and outcomes. Being remote, always available and highly scalable, this service is fundamental in high-demand situations, such as the current COVID-19 outbreak.

Introduction

Shortage of physicians and increasing healthcare costs have created a need for digital solutions to better optimize medical resources. In addition, patient expectations for mobile, fast and easy 24/7 access to doctors and health services drive the development of patient-centered solutions.


Council Post: How Can We Balance Progress And Privacy With Machine Learning?

#artificialintelligence

Depending on who you talk to, artificial intelligence (AI) is either the most exciting and transformative technology in our lifetime or the most dangerous and troubling. Science fiction has shaped some of these AI worries; movies like I, Robot, The Matrix and the entire Terminator franchise are built on the notion that machines can become conscious and independent in their thoughts and actions -- an understandably alarming idea. But I think that the casual way we talk about AI in pop culture contributes to unfounded fears about the machines taking over and distracts from valid concerns about privacy. It's helpful to shift this conversation from AI as a whole to the more specific domain of machine learning (ML), and how we can harness its advantages and minimize its drawbacks. Machine Learning is a subset of AI that describes how computer algorithms get better at solving their domain specific problems by leveraging large data sets with which to train themselves.


Overcoming Negative Transfer: A Survey

arXiv.org Machine Learning

Transfer learning (TL) tries to utilize data or knowledge from one or more source domains to facilitate the learning in a target domain. It is particularly useful when the target domain has few or no labeled data, due to annotation expense, privacy concerns, etc. Unfortunately, the effectiveness of TL is not always guaranteed. Negative transfer (NT), i.e., the case where source domain data/knowledge reduce learning performance in the target domain, has been a long-standing and challenging problem in TL. Various approaches to overcome NT have been proposed in the literature. However, there has not been a systematic survey on overcoming NT. This paper fills the gap by categorizing and reviewing nearly 100 approaches for combating NT, from four perspectives: source data quality, target data quality, domain divergence, and integrated algorithms. NT in related fields, e.g., multi-task learning, multilingual models, and lifelong learning, is also discussed.