Can Enterra's Advanced AI Systems Stop The Fake News Epidemic?
The simplest way to eliminate the spread of fake news would be to limit ourselves to a small group of mainstream publishers who do all their own reporting and fact-checking. The counterargument, of course, is that an open and democratic society allows for a wide range of voices, not just the ones a small cabal of editors deems acceptable. Fake news threatens to destroy this system and undermine trust and democracy, which is why addressing fake news has become one of the tech industry's most significant challenges. His initial focus, post-9/11, was on national security, which is how he first became intrigued by the advantages AI offers in analyzing complex data sets. As 2017's fake news scandals grew, DeAngelis was approached by leading media industry veteran Greg D'Alba, CEO of VIDL News, to apply to the media industry the same type of analysis Enterra was using to control the complex value chains of Fortune 500 companies. D'Alba saw a growing need there to verify and validate news stories.
Questioning AI: what can scientists learn from artificial intelligence? – Science Weekly podcast
In October 2017, researchers at Google DeepMind published a paper on an artificial intelligence (AI) program called AlphaGo Zero. Unlike previous incarnations of AlphaGo, this updated version mastered the game of Go through self-play alone. Talking about the achievement, lead researcher David Silver explained that AlphaGo Zero had invented "its own variants which humans don't even know about or play at the moment." And it's here that a new and exciting use for AI comes to light. Could it be that AI might teach humans about the world around us?
Drone hit newly erected crane during Kent site survey - report
A pilot has flown a drone into a crane, according to an air-accident report. The pilot had planned the drone flight in Kent with four reference points, all at 400ft above ground level - higher than three existing cranes on the site. But another crane was erected after his site safety visit, and on take-off the drone crashed into the jib of the new structure, damaging the unmanned craft. The crash, in June last year, is listed in the Air Accidents Investigation Branch (AAIB) update this month. The incident report was picked up by The Register.
Learning what to share between loosely related tasks
Ruder, Sebastian, Bingel, Joachim, Augenstein, Isabelle, Søgaard, Anders
Multi-task learning is motivated by the observation that humans bring to bear what they know about related problems when solving new ones. Similarly, deep neural networks can profit from related tasks by sharing parameters with other networks. However, humans do not consciously decide to transfer knowledge between tasks. In Natural Language Processing (NLP), it is hard to predict if sharing will lead to improvements, particularly if tasks are only loosely related. To overcome this, we introduce Sluice Networks, a general framework for multi-task learning where trainable parameters control the amount of sharing. Our framework generalizes previous proposals in enabling sharing of all combinations of subspaces, layers, and skip connections. We perform experiments on three task pairs, and across seven different domains, using data from OntoNotes 5.0, and achieve up to 15% average error reductions over common approaches to multi-task learning. We show that a) label entropy is predictive of gains in sluice networks, confirming findings for hard parameter sharing and b) while sluice networks easily fit noise, they are robust across domains in practice.
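The idea of trainable parameters controlling the amount of sharing can be illustrated with a minimal sketch. It assumes two task networks whose hidden states are linearly recombined by a 2x2 weight matrix; the names `mix_task_states` and `alpha`, and all numbers, are hypothetical and not taken from the paper. In a real model `alpha` would be learned by gradient descent alongside the other weights.

```python
import numpy as np

def mix_task_states(h_a, h_b, alpha):
    """Recombine the hidden states of two task networks.

    alpha is a 2x2 matrix (hypothetical, trainable in a real model):
    row i holds the weights task i places on [h_a, h_b].
    """
    stacked = np.stack([h_a, h_b])   # shape (2, d)
    mixed = alpha @ stacked          # shape (2, d)
    return mixed[0], mixed[1]

# Illustrative hidden states for task A and task B.
h_a = np.array([1.0, 2.0])
h_b = np.array([3.0, 4.0])

# Identity weights: no sharing, each task keeps its own state.
own_a, own_b = mix_task_states(h_a, h_b, np.eye(2))

# Off-diagonal weights: each task borrows a little from the other.
alpha = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
shared_a, shared_b = mix_task_states(h_a, h_b, alpha)
```

With identity weights the two networks are independent; moving weight off the diagonal increases sharing, so the amount of transfer becomes something the optimizer can adjust rather than a fixed design choice.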
Cooperating with Machines
Crandall, Jacob W., Oudah, Mayada, Tennom, Ishowo-Oloko, Fatimah, Abdallah, Sherief, Bonnefon, Jean-François, Cebrian, Manuel, Shariff, Azim, Goodrich, Michael A., Rahwan, Iyad
Since Alan Turing envisioned Artificial Intelligence (AI) [1], a major driving force behind technical progress has been competition with human cognition. Historical milestones have been frequently associated with computers matching or outperforming humans in difficult cognitive tasks (e.g. face recognition [2], personality classification [3], driving cars [4], or playing video games [5]), or defeating humans in strategic zero-sum encounters (e.g. Chess [6], Checkers [7], Jeopardy! [8], Poker [9], or Go [10]). In contrast, less attention has been given to developing autonomous machines that establish mutually cooperative relationships with people who may not share the machine's preferences. A main challenge has been that human cooperation does not require sheer computational power, but rather relies on intuition [11], cultural norms [12], emotions and signals [13, 14, 15, 16], and pre-evolved dispositions toward cooperation [17], common-sense mechanisms that are difficult to encode in machines for arbitrary contexts. Here, we combine a state-of-the-art machine-learning algorithm with novel mechanisms for generating and acting on signals to produce a new learning algorithm that cooperates with people and other machines at levels that rival human cooperation in a variety of two-player repeated stochastic games. This is the first general-purpose algorithm that is capable, given a description of a previously unseen game environment, of learning to cooperate with people within short timescales in scenarios previously unanticipated by algorithm designers. This is achieved without complex opponent modeling or higher-order theories of mind, thus showing that flexible, fast, and general human-machine cooperation is computationally achievable using a non-trivial, but ultimately simple, set of algorithmic mechanisms.
Long-term Blood Pressure Prediction with Deep Recurrent Neural Networks
Su, Peng, Ding, Xiao-Rong, Zhang, Yuan-Ting, Liu, Jing, Miao, Fen, Zhao, Ni
As a result, these models suffer from accuracy decay over a long time and thus require frequent calibration. In this work, we address this issue by formulating BP estimation as a sequence prediction problem in which both the input and target are temporal sequences. We propose a novel deep recurrent neural network (RNN) consisting of multilayered Long Short-Term Memory (LSTM) networks, which are incorporated with (1) a bidirectional structure to access larger-scale context information of input sequence, and (2) residual connections to allow gradients in deep RNN to propagate more effectively. The proposed deep RNN model was tested on a static BP dataset, and it achieved root mean square error (RMSE) of 3.90 and 2.66 mmHg for systolic BP (SBP) and diastolic BP (DBP) prediction respectively, surpassing the accuracy of traditional BP prediction models. On a multi-day BP dataset, the deep RNN achieved RMSE of 3.84, 5.25, 5.80 and 5.81 mmHg for the 1st day, 2nd day, 4th day and 6th month after the 1st day SBP prediction, and 1.80, 4.78, 5.0, 5.21 mmHg for corresponding DBP prediction, respectively, which outperforms all previous models with notable improvement. The experimental results suggest that modeling the temporal dependencies in BP dynamics significantly improves the long-term BP prediction accuracy.
A Review of 40 Years of Cognitive Architecture Research: Core Cognitive Abilities and Practical Applications
Kotseruba, Iuliia, Tsotsos, John K.
In this paper we present a broad overview of the last 40 years of research on cognitive architectures. Although the number of existing architectures is nearing several hundred, most of the existing surveys do not reflect this growth and focus on a handful of well-established architectures. Thus, in this survey we wanted to shift the focus towards a more inclusive and high-level overview of the research on cognitive architectures. Our final set of 84 architectures includes 49 that are still actively developed, and borrows from a diverse set of disciplines, spanning areas from psychoanalysis to neuroscience. To keep the length of this paper within reasonable limits we discuss only the core cognitive abilities, such as perception, attention mechanisms, action selection, memory, learning and reasoning. In order to assess the breadth of practical applications of cognitive architectures we gathered information on over 900 practical projects implemented using the cognitive architectures in our list. We use various visualization techniques to highlight overall trends in the development of the field. In addition to summarizing the current state of the art in cognitive architecture research, this survey describes a variety of methods and ideas that have been tried and their relative success in modeling human cognitive abilities, as well as which aspects of cognitive behavior need more research with respect to their mechanistic counterparts and thus can further inform how cognitive science might progress.
OptNet: Differentiable Optimization as a Layer in Neural Networks
Amos, Brandon, Kolter, J. Zico
This paper presents OptNet, a network architecture that integrates optimization problems (here, specifically in the form of quadratic programs) as individual layers in larger end-to-end trainable deep networks. These layers encode constraints and complex dependencies between the hidden states that traditional convolutional and fully-connected layers often cannot capture. In this paper, we explore the foundations for such an architecture: we show how techniques from sensitivity analysis, bilevel optimization, and implicit differentiation can be used to exactly differentiate through these layers and with respect to layer parameters; we develop a highly efficient solver for these layers that exploits fast GPU-based batch solves within a primal-dual interior point method, and which provides backpropagation gradients with virtually no additional cost on top of the solve; and we highlight the application of these approaches in several problems. In one notable example, we show that the method is capable of learning to play mini-Sudoku (4x4) given just input and output games, with no a priori information about the rules of the game; this highlights the ability of our architecture to learn hard constraints better than other neural architectures.
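Differentiating through an optimization layer can be illustrated on a deliberately simplified case. The sketch below assumes a quadratic program with equality constraints only (minimize 1/2 z'Qz + q'z subject to Az = b), so the solution is given by a single linear KKT system and the Jacobian of the solution with respect to q follows by implicit differentiation of that system. It is not the paper's solver: there are no inequality constraints, no interior point method, and no batching, and the function names and problem data are hypothetical.

```python
import numpy as np

def solve_eq_qp(Q, q, A, b):
    """Solve min 0.5 z'Qz + q'z  s.t.  Az = b via its KKT system."""
    n, m = Q.shape[0], A.shape[0]
    K = np.block([[Q, A.T],
                  [A, np.zeros((m, m))]])
    sol = np.linalg.solve(K, np.concatenate([-q, b]))
    return sol[:n], K   # primal solution z* and the KKT matrix

def dz_dq(K, n):
    """Jacobian of z* with respect to q by implicit differentiation.

    Differentiating K [z; nu] = [-q; b] in q gives K [dz; dnu] = [-I; 0].
    """
    m = K.shape[0] - n
    rhs = np.vstack([-np.eye(n), np.zeros((m, n))])
    return np.linalg.solve(K, rhs)[:n]

# Illustrative problem data (hypothetical).
Q = np.array([[2.0, 0.0],
              [0.0, 1.0]])
q = np.array([1.0, -1.0])
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

z_star, K = solve_eq_qp(Q, q, A, b)
jac = dz_dq(K, n=2)

# Finite-difference check of the analytic Jacobian.
eps = 1e-6
fd = np.zeros((2, 2))
for j in range(2):
    dq = np.zeros(2)
    dq[j] = eps
    z_plus, _ = solve_eq_qp(Q, q + dq, A, b)
    fd[:, j] = (z_plus - z_star) / eps
```

The analytic Jacobian reuses the same KKT matrix that was factored for the forward solve, which gives some intuition for why gradients can come at little extra cost once the problem has been solved.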
A Model of Multi-Agent Consensus for Vague and Uncertain Beliefs
Crosscombe, Michael, Lawry, Jonathan
Consensus formation is investigated for multi-agent systems in which agents' beliefs are both vague and uncertain. Vagueness is represented by a third truth state meaning "borderline". This is combined with a probabilistic model of uncertainty. A belief combination operator is then proposed which exploits borderline truth values to enable agents with conflicting beliefs to reach a compromise. A number of simulation experiments are carried out in which agents apply this operator in pairwise interactions, under the bounded confidence restriction that the two agents' beliefs must be sufficiently consistent with each other before agreement can be reached. As well as studying the consensus operator in isolation we also investigate scenarios in which agents are influenced either directly or indirectly by the state of the world. For the former we conduct simulations which combine consensus formation with belief updating based on evidence. For the latter we investigate the effect of assuming that the closer an agent's beliefs are to the truth the more visible they are in the consensus building process. In all cases applying the consensus operators results in the population converging to a single shared belief which is both crisp and certain. Furthermore, simulations which combine consensus formation with evidential updating converge faster to a shared opinion which is closer to the actual state of the world than those in which beliefs are only changed as a result of directly receiving new evidence. Finally, if agent interactions are guided by belief quality measured as similarity to the true state of the world, then applying the consensus operator alone results in the population converging to a high quality shared belief.
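A pairwise interaction of this kind can be sketched as follows. The combination rule here is an assumption chosen to match the description above, not the paper's exact operator: agreement is kept, a borderline value defers to a crisp one, and a direct true/false conflict is resolved to borderline. The probabilistic layer is omitted, and the inconsistency measure and `threshold` parameter are hypothetical stand-ins for the bounded confidence restriction.

```python
# Three truth values: true, borderline, false.
T, B, F = 1.0, 0.5, 0.0

def consensus(v1, v2):
    """Combine two truth values (assumed rule, see lead-in)."""
    if v1 == v2:
        return v1        # agreement is preserved
    if v1 == B:
        return v2        # borderline defers to a crisp value
    if v2 == B:
        return v1
    return B             # direct conflict: compromise on borderline

def combine_beliefs(b1, b2):
    """Apply the operator proposition by proposition."""
    return [consensus(x, y) for x, y in zip(b1, b2)]

def inconsistency(b1, b2):
    """Fraction of propositions on which the agents directly conflict."""
    conflicts = sum(1 for x, y in zip(b1, b2) if {x, y} == {T, F})
    return conflicts / len(b1)

def interact(b1, b2, threshold):
    """Both agents adopt the combined belief only if consistent enough."""
    if inconsistency(b1, b2) <= threshold:
        merged = combine_beliefs(b1, b2)
        return merged, list(merged)
    return list(b1), list(b2)

agent_1 = [T, B, F]
agent_2 = [F, T, F]
merged = combine_beliefs(agent_1, agent_2)
blocked = interact(agent_1, agent_2, threshold=0.2)
allowed = interact(agent_1, agent_2, threshold=0.5)
```

In this example the agents conflict on one proposition out of three, so a threshold of 0.2 leaves both unchanged while a threshold of 0.5 lets them merge.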
Controlling for Unobserved Confounds in Classification Using Correlational Constraints
Landeiro, Virgile, Culotta, Aron
As statistical classifiers become integrated into real-world applications, it is important to consider not only their accuracy but also their robustness to changes in the data distribution. In this paper, we consider the case where there is an unobserved confounding variable $z$ that influences both the features $\mathbf{x}$ and the class variable $y$. When the influence of $z$ changes from training to testing data, we find that the classifier accuracy can degrade rapidly. In our approach, we assume that we can predict the value of $z$ at training time with some error. The prediction for $z$ is then fed to Pearl's back-door adjustment to build our model. Because of the attenuation bias caused by measurement error in $z$, standard approaches to controlling for $z$ are ineffective. In response, we propose a method to properly control for the influence of $z$ by first estimating its relationship with the class variable $y$, then updating predictions for $z$ to match that estimated relationship. By adjusting the influence of $z$, we show that we can build a model that exceeds competing baselines on accuracy as well as on robustness over a range of confounding relationships.
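The back-door adjustment mentioned above replaces the training-time relationship between the features and the confounder with the marginal of the confounder: p(y | do(x)) = sum over z of p(y | x, z) p(z). The sketch below contrasts this with the unadjusted estimate for binary x, z and y. All probability tables are hypothetical, and the paper's additional step of correcting for measurement error in the predicted z is not shown.

```python
import numpy as np

# p(y=1 | x, z), indexed [x, z] (hypothetical numbers).
p_y_given_xz = np.array([[0.2, 0.6],
                         [0.3, 0.7]])

# Marginal p(z) and conditional p(z | x), indexed [x, z].
# x and z are strongly correlated in this made-up training data.
p_z = np.array([0.5, 0.5])
p_z_given_x = np.array([[0.9, 0.1],
                        [0.1, 0.9]])

# Back-door adjustment: average over z using its marginal distribution.
adjusted = p_y_given_xz @ p_z

# Unadjusted estimate: average over z using p(z | x), which lets the
# confounder's correlation with x leak into the prediction.
naive = (p_y_given_xz * p_z_given_x).sum(axis=1)

adjusted_gap = adjusted[1] - adjusted[0]
naive_gap = naive[1] - naive[0]
```

Here the unadjusted estimate attributes a 0.42 difference in p(y=1) to x, while the adjusted estimate attributes only 0.10; the remainder is due to z. If the relationship between x and z shifts at test time, a model relying on the unadjusted estimate would be expected to degrade more.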