The authors of the Harrisburg University study make explicit their desire to provide "a significant advantage for law enforcement agencies and other intelligence agencies to prevent crime" as a co-author and former NYPD police officer outlined in the original press release. At a time when the legitimacy of the carceral state, and policing in particular, is being challenged on fundamental grounds in the United States, there is high demand in law enforcement for research of this nature, research which erases historical violence and manufactures fear through the so-called prediction of criminality. Publishers and funding agencies serve a crucial role in feeding this ravenous maw by providing platforms and incentives for such research. The circulation of this work by a major publisher like Springer would represent a significant step towards the legitimation and application of repeatedly debunked, socially harmful research in the real world. To reiterate our demands, the review committee must publicly rescind the offer for publication of this specific study, along with an explanation of the criteria used to evaluate it. Springer must issue a statement condemning the use of criminal justice statistics to predict criminality and acknowledging their role in incentivizing such harmful scholarship in the past. Finally, all publishers must refrain from publishing similar studies in the future.
This two-part review examines how automation has contributed to different aspects of discovery in the chemical sciences. In this first part, we describe a classification for discoveries of physical matter (molecules, materials, devices), processes, and models and how they are unified as search problems. We then introduce a set of questions and considerations relevant to assessing the extent of autonomy. Finally, we describe many case studies of discoveries accelerated by or resulting from computer assistance and automation from the domains of synthetic chemistry, drug discovery, inorganic chemistry, and materials science. These illustrate how rapid advancements in hardware automation and machine learning continue to transform the nature of experimentation and modelling. Part two reflects on these case studies and identifies a set of open challenges for the field.
Life's most valuable asset is health. Continuously understanding the state of our health and modeling how it evolves is essential if we wish to improve it. Given the opportunity that people live with more data about their life today than any other time in history, the challenge rests in interweaving this data with the growing body of knowledge to compute and model the health state of an individual continually. This dissertation presents an approach to build a personal model and dynamically estimate the health state of an individual by fusing multi-modal data and domain knowledge. The system is stitched together from four essential abstraction elements: 1. the events in our life, 2. the layers of our biological systems (from molecular to an organism), 3. the functional utilities that arise from biological underpinnings, and 4. how we interact with these utilities in the reality of daily life. Connecting these four elements via graph network blocks forms the backbone by which we instantiate a digital twin of an individual. Edges and nodes in this graph structure are then regularly updated with learning techniques as data is continuously digested. Experiments demonstrate the use of dense and heterogeneous real-world data from a variety of personal and environmental sensors to monitor individual cardiovascular health state. State estimation and individual modeling is the fundamental basis to depart from disease-oriented approaches to a total health continuum paradigm. Precision in predicting health requires understanding state trajectory. By encasing this estimation within a navigational approach, a systematic guidance framework can plan actions to transition a current state towards a desired one. This work concludes by presenting this framework of combining the health state and personal graph model to perpetually plan and assist us in living life towards our goals.
Neural networks for structured data like graphs have been studied extensively in recent years. To date, the bulk of research activity has focused mainly on static graphs. However, most real-world networks are dynamic since their topology tends to change over time. Predicting the evolution of dynamic graphs is a task of high significance in the area of graph mining. Despite its practical importance, the task has not been explored in depth so far, mainly due to its challenging nature. In this paper, we propose a model that predicts the evolution of dynamic graphs. Specifically, we use a graph neural network along with a recurrent architecture to capture the temporal evolution patterns of dynamic graphs. Then, we employ a generative model which predicts the topology of the graph at the next time step and constructs a graph instance that corresponds to that topology. We evaluate the proposed model on several artificial datasets following common network evolving dynamics, as well as on real-world datasets. Results demonstrate the effectiveness of the proposed model.
Recent advancements in graph representation learning have led to the emergence of condensed encodings that capture the main properties of a graph. However, even though these abstract representations are powerful for downstream tasks, they are not equally suitable for visualisation purposes. In this work, we merge Mapper, an algorithm from the field of Topological Data Analysis (TDA), with the expressive power of Graph Neural Networks (GNNs) to produce hierarchical, topologically-grounded visualisations of graphs. These visualisations do not only help discern the structure of complex graphs but also provide a means of understanding the models applied to them for solving various tasks. We further demonstrate the suitability of Mapper as a topological framework for graph pooling by mathematically proving an equivalence with Min-Cut and Diff Pool. Building upon this framework, we introduce a novel pooling algorithm based on PageRank, which obtains competitive results with state of the art methods on graph classification benchmarks.
Advances in Data Science are lately permeating every field of Transportation Science and Engineering, making it straightforward to imagine that developments in the transportation sector will be data-driven. Nowadays, Intelligent Transportation Systems (ITS) could be arguably approached as a "story" intensively producing and consuming large amounts of data. A diversity of sensing devices densely spread over the infrastructure, vehicles or the travelers' personal devices act as sources of data flows that are eventually fed to software running on automatic devices, actuators or control systems producing, in turn, complex information flows between users, traffic managers, data analysts, traffic modeling scientists, etc. These information flows provide enormous opportunities to improve model development and decision-making. The present work aims to describe how data, coming from diverse ITS sources, can be used to learn and adapt data-driven models for efficiently operating ITS assets, systems and processes; in other words, for data-based models to fully become actionable. Grounded on this described data modeling pipeline for ITS, we define the characteristics, engineering requisites and challenges intrinsic to its three compounding stages, namely, data fusion, adaptive learning and model evaluation. We deliberately generalize model learning to be adaptive, since, in the core of our paper is the firm conviction that most learners will have to adapt to the everchanging phenomenon scenario underlying the majority of ITS applications. Finally, we provide a prospect of current research lines within the Data Science realm that can bring notable advances to data-based ITS modeling, which will eventually bridge the gap towards the practicality and actionability of such models.
Digital agriculture has the promise to transform agricultural throughput. It can do this by applying data science and engineering for mapping input factors to crop throughput, while bounding the available resources. In addition, as the data volumes and varieties increase with the increase in sensor deployment in agricultural fields, data engineering techniques will also be instrumental in collection of distributed data as well as distributed processing of the data. These have to be done such that the latency requirements of the end users and applications are satisfied. Understanding how farm technology and big data can improve farm productivity can significantly increase the world's food production by 2050 in the face of constrained arable land and with the water levels receding. While much has been written about digital agriculture's potential, little is known about the economic costs and benefits of these emergent systems. In particular, the on-farm decision making processes, both in terms of adoption and optimal implementation, have not been adequately addressed. For example, if some algorithm needs data from multiple data owners to be pooled together, that raises the question of data ownership. This paper is the first one to bring together the important questions that will guide the end-to-end pipeline for the evolution of a new generation of digital agricultural solutions, driving the next revolution in agriculture and sustainability under one umbrella.
Its impact is drastic and real: Youtube's AIdriven recommendation system would present sports videos for days if one happens to watch a live baseball game on the platform ; email writing becomes much faster with machine learning (ML) based auto-completion ; many businesses have adopted natural language processing based chatbots as part of their customer services . AI has also greatly advanced human capabilities in complex decision-making processes ranging from determining how to allocate security resources to protect airports  to games such as poker  and Go . All such tangible and stunning progress suggests that an "AI summer" is happening. As some put it, "AI is the new electricity" . Meanwhile, in the past decade, an emerging theme in the AI research community is the so-called "AI for social good" (AI4SG): researchers aim at developing AI methods and tools to address problems at the societal level and improve the wellbeing of the society.
The areas of machine learning and knowledge discovery in databases have considerably matured in recent years. In this article, we briefly review recent developments as well as classical algorithms that stood the test of time. Our goal is to provide a general introduction into different tasks such as learning from tabular data, behavioral data, or textual data, with a particular focus on actual and potential applications in behavioral sciences. The supplemental appendix to the article also provides practical guidance for using the methods by pointing the reader to proven software implementations. The focus is on R, but we also cover some libraries in other programming languages as well as systems with easy-to-use graphical interfaces.