"The field of Machine Learning seeks to answer these questions: How can we build computer systems that automatically improve with experience, and what are the fundamental laws that govern all learning processes?"
– from The Discipline of Machine Learning by Tom Mitchell. CMU-ML-06-108, 2006.
When school began in Lockport, New York, this past fall, the halls were lined not just with posters and lockers, but cameras. Over the summer, a brand new $4 million facial recognition system was installed by the school district in the town's eight schools from elementary to high school. The system scans the faces of students as they roam the halls, looking for faces that have been uploaded and flagged as dangerous. "Any way that we can improve safety and security in schools is always money well spent," David Lowry, president of the Lockport Education Association, told the Lockport Union-Sun & Journal. Rose Eveleth is an Ideas contributor at WIRED and the creator and host of Flash Forward, a podcast about possible (and not so possible) futures.
The internet is full of lies. That maxim has become an operating assumption for any remotely skeptical person interacting anywhere online, from Facebook and Twitter to phishing-plagued inboxes to spammy comment sections to online dating and disinformation-plagued media. Now one group of researchers has suggested the first hint of a solution: They claim to have built a prototype for an "online polygraph" that uses machine learning to detect deception from text alone. But what they've actually demonstrated, according to a few machine learning academics, is the inherent danger of overblown machine learning claims. In last month's issue of the journal Computers in Human Behavior, Florida State University and Stanford researchers proposed a system that uses automated algorithms to separate truths and lies, what they refer to as the first step toward "an online polygraph system--or a prototype detection system for computer-mediated deception when face-to-face interaction is not available."
With any change comes the fear of the unknown, but this is especially true when it comes to artificial intelligence. Universities today have so much to gain by leveraging AI across the student lifecycle, but many are hesitant. Taking a step back, this somewhat nebulous concept of AI is already taking root in our everyday lives in so many forms. Today, you can wake up with a reminder and a playlist of your favorite motivational morning music via a voice-activated assistant, then get traffic advice on your way to work from a maps app. A quick tap on a suggestion based on previous purchases, and your favorite variety of coffee is waiting at your favorite store, already paid for in-app.
The dataset contains 367,888 face annotations for 8,277 subjects divided into 3 batches. We provide human-curated bounding boxes for faces. We also provide the estimated pose (yaw, pitch, and roll), locations of twenty-one keypoints, and gender information generated by a pre-trained neural network. In addition, we release a new face verification test protocol based on batch 3.
Part 2 - Video Frames
The second part contains 3,735,476 annotated video frames extracted from a total of 22,075 videos for 3,107 subjects. Again, we provide the estimated pose (yaw, pitch, and roll), locations of twenty-one keypoints, and gender information generated by a pre-trained neural network.
DeepMind's Research Platform Team has open-sourced TF-Replicator, a framework that enables researchers without previous experience with distributed systems to deploy their TensorFlow models on GPUs and Cloud TPUs. The move aims to strengthen AI research and development. Synced invited Yuan Tang, a senior software engineer at Ant Financial, to share his thoughts on TF-Replicator.
How would you describe TF-Replicator?
TF-Replicator is a framework to simplify the writing of distributed TensorFlow code for training machine learning models, so that they can be effortlessly deployed to different cluster architectures.
In this paper, we present exploitability descent, a new algorithm to compute approximate equilibria in two-player zero-sum extensive-form games with imperfect information, by direct policy optimization against worst-case opponents. We prove that when following this optimization, the exploitability of a player's strategy converges asymptotically to zero, and hence when both players employ this optimization, the joint policies converge to a Nash equilibrium. Unlike fictitious play (XFP) and counterfactual regret minimization (CFR), our convergence result pertains to the policies being optimized rather than the average policies. Our experiments demonstrate convergence rates comparable to XFP and CFR in four benchmark games in the tabular case. Using function approximation, we find that our algorithm outperforms the tabular version in two of the games, which, to the best of our knowledge, is the first such result in imperfect information games among this class of algorithms.
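To make the notion of exploitability concrete, here is a minimal sketch in a much simpler setting than the paper's: a two-player zero-sum matrix game (rock-paper-scissors) rather than an extensive-form game. The payoff matrix, step size, iteration count, and all function names are assumptions chosen for illustration. The update is a plain projected subgradient step against the opponent's best response, which is only loosely analogous to the algorithm described above.

```python
# Rock-paper-scissors payoffs for the row player (assumed example game).
A = [
    [0.0, -1.0, 1.0],   # rock     vs rock, paper, scissors
    [1.0, 0.0, -1.0],   # paper
    [-1.0, 1.0, 0.0],   # scissors
]
GAME_VALUE = 0.0  # the value of this symmetric game is zero


def column_payoffs(x, A):
    """Row player's expected payoff against each pure opponent action."""
    return [sum(x[i] * A[i][j] for i in range(len(x))) for j in range(len(A[0]))]


def exploitability(x, A, game_value=GAME_VALUE):
    """How much a worst-case (best-responding) opponent gains against x."""
    return game_value - min(column_payoffs(x, A))


def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for i, u_i in enumerate(u, start=1):
        cumulative += u_i
        t = (cumulative - 1.0) / i
        if u_i - t > 0:
            theta = t
    return [max(v_i - theta, 0.0) for v_i in v]


def descend(x, A, step=0.05, iters=200):
    """Projected subgradient steps against the opponent's best response."""
    best = exploitability(x, A)
    for _ in range(iters):
        payoffs = column_payoffs(x, A)
        j = payoffs.index(min(payoffs))          # opponent's best response
        x = project_to_simplex([x[i] + step * A[i][j] for i in range(len(x))])
        best = min(best, exploitability(x, A))   # lowest exploitability seen
    return x, best


uniform = [1.0 / 3.0] * 3
rock = [1.0, 0.0, 0.0]
final_x, best = descend(rock, A)
print(exploitability(rock, A), best)
```

Always playing rock is fully exploitable, while the uniform strategy cannot be exploited at all. Starting from pure rock, the descent steps move the strategy toward the uniform equilibrium.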
Dialogue systems have recently become essential in our lives. Their use is becoming increasingly fluid and easy over time, thanks to improvements in the NLP and AI fields. In this paper, we provide an overview of the current state of the art of dialogue systems, their categories, and the different approaches to building them. We conclude with a discussion that compares the techniques and analyzes the strengths and weaknesses of each. Finally, we present an opinion piece suggesting that research be oriented towards standardizing how dialogue systems are built.
This paper describes our system submitted to SemEval 2019 Task 7: RumourEval 2019: Determining Rumour Veracity and Support for Rumours, Subtask A (Gorrell et al., 2019). The challenge focused on classifying whether posts from Twitter and Reddit support, deny, query, or comment on a hidden rumour, the truthfulness of which is the topic of an underlying discussion thread. We formulate the problem as stance classification, determining the rumour stance of a post with respect to the previous thread post and the source thread post. The recent BERT architecture was employed to build an end-to-end system which reached an F1 score of 61.67% on the provided test data. It finished in 2nd place in the competition, without any hand-crafted features, only 0.2% behind the winner.
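As a rough illustration of how an F1 score over the four stance classes can be computed, the sketch below assumes macro-averaging (an unweighted mean of per-class F1), which the text above does not state explicitly. The label lists are made-up toy data, not results from the system, and the function name is hypothetical.

```python
LABELS = ["support", "deny", "query", "comment"]


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 scores (macro-averaging is assumed)."""
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # A class that is never predicted and never present scores 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy gold and predicted stances for six posts (illustrative only).
gold = ["support", "deny", "query", "comment", "comment", "comment"]
pred = ["support", "comment", "query", "comment", "comment", "support"]
score = macro_f1(gold, pred)
print(round(score, 4))
```

Under macro-averaging, a rare class such as "deny" counts as much as the frequent "comment" class, so missing it entirely pulls the overall score down sharply.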
We study strategy synthesis for partially observable Markov decision processes (POMDPs). The particular problem is to determine strategies that provably adhere to (probabilistic) temporal logic constraints. This problem is computationally intractable and theoretically hard. We propose a novel method that combines techniques from machine learning and formal verification. First, we train a recurrent neural network (RNN) to encode POMDP strategies. The RNN accounts for memory-based decisions without the need to expand the full belief space of a POMDP. Secondly, we restrict the RNN-based strategy to represent a finite-memory strategy and implement it on a specific POMDP. For the resulting finite Markov chain, efficient formal verification techniques provide provable guarantees against temporal logic specifications. If the specification is not satisfied, counterexamples supply diagnostic information. We use this information to improve the strategy by iteratively training the RNN. Numerical experiments show that the proposed method elevates the state of the art in POMDP solving by up to three orders of magnitude in terms of solving times and model sizes.
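The verification step can be pictured with a small sketch: once a finite-memory strategy is fixed, the POMDP induces a finite Markov chain, and a reachability specification becomes a question about that chain. The four-state chain, the threshold, and the function name below are hypothetical. Plain numerical iteration stands in for a real model checker, so the sketch illustrates the quantity being checked without offering any formal guarantee.

```python
def reachability_probabilities(P, goal, iters=1000):
    """Approximate the probability of eventually reaching a goal state.

    P is a row-stochastic transition matrix given as nested lists.
    Iterating from zero converges to the reachability probabilities.
    """
    n = len(P)
    p = [1.0 if s in goal else 0.0 for s in range(n)]
    for _ in range(iters):
        p = [
            1.0 if s in goal else sum(P[s][t] * p[t] for t in range(n))
            for s in range(n)
        ]
    return p


# Hypothetical induced Markov chain: state 2 is the goal, state 3 is a trap.
P = [
    [0.0, 0.8, 0.0, 0.2],
    [0.1, 0.0, 0.9, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

probs = reachability_probabilities(P, goal={2})
threshold = 0.75  # assumed specification: reach the goal with probability >= 0.75
satisfied = probs[0] >= threshold
print(probs[0], satisfied)
```

If `satisfied` were false, a model checker would additionally return counterexamples, which the method described above feeds back into training the RNN.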
The most common failure mode is divergence, where the Q-function approximator learns to ascribe unrealistically high values to state-action pairs, in turn destroying the quality of the greedy control policy derived from Q (van Hasselt et al., 2018). Divergence in DQL is often attributed to three components common to all DQL algorithms, which are collectively considered the 'deadly triad' of reinforcement learning (Sutton, 1988; Sutton & Barto, 2018):
- function approximation, in this case the use of deep neural networks,
- off-policy learning, the use of data collected on one [...]
[...] algorithms for control, employs three techniques collectively known as the 'deadly triad' in reinforcement learning: bootstrapping, off-policy learning, and function approximation. Prior work has demonstrated that together these can lead to divergence in Q-learning algorithms, but the conditions under which divergence occurs are not well-understood. In this note, we give a simple analysis based on a linear approximation to the Q-value updates, which we believe provides insight into divergence under the deadly triad. The central point in our analysis is to consider when the leading order approximation to the deep-Q update is or is not a contraction in the sup norm.
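For intuition about what "a contraction in the sup norm" means, the sketch below checks the property for the exact tabular Bellman optimality operator on a small random MDP. This is the well-behaved baseline case, not the linearized deep-Q update analyzed in the note. The MDP sizes, seed, and function names are assumptions made for illustration.

```python
import random


def random_mdp(n_states, n_actions, rng):
    """Random transition probabilities P[s][a][s'] and rewards R[s][a]."""
    P, R = [], []
    for _ in range(n_states):
        P_s, R_s = [], []
        for _ in range(n_actions):
            weights = [rng.random() for _ in range(n_states)]
            total = sum(weights)
            P_s.append([w / total for w in weights])
            R_s.append(rng.uniform(-1.0, 1.0))
        P.append(P_s)
        R.append(R_s)
    return P, R


def bellman_optimality(Q, P, R, gamma):
    """(TQ)(s, a) = R(s, a) + gamma * sum_s' P(s'|s, a) * max_a' Q(s', a')."""
    n_states, n_actions = len(Q), len(Q[0])
    greedy_values = [max(Q[s]) for s in range(n_states)]
    return [
        [
            R[s][a] + gamma * sum(P[s][a][t] * greedy_values[t] for t in range(n_states))
            for a in range(n_actions)
        ]
        for s in range(n_states)
    ]


def sup_norm_diff(Q1, Q2):
    """Largest absolute entrywise difference between two Q tables."""
    return max(abs(a - b) for r1, r2 in zip(Q1, Q2) for a, b in zip(r1, r2))


rng = random.Random(0)
gamma = 0.9
P, R = random_mdp(n_states=5, n_actions=3, rng=rng)
Q1 = [[rng.uniform(-10, 10) for _ in range(3)] for _ in range(5)]
Q2 = [[rng.uniform(-10, 10) for _ in range(3)] for _ in range(5)]

before = sup_norm_diff(Q1, Q2)
after = sup_norm_diff(bellman_optimality(Q1, P, R, gamma),
                      bellman_optimality(Q2, P, R, gamma))
print(after <= gamma * before)
```

In the tabular case the operator always shrinks the sup-norm distance by at least a factor of `gamma`, which is why repeated updates converge. The analysis above asks when an approximate, function-approximation-based update keeps or loses this property.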