r/MachineLearning - [R] Faster AutoAugment: Learning Augmentation Strategies using Backpropagation


Abstract: Data augmentation methods are indispensable heuristics to boost the performance of deep neural networks, especially in image recognition tasks. Recently, several studies have shown that augmentation strategies found by search algorithms outperform hand-made strategies. Such methods employ black-box search algorithms over image transformations with continuous or discrete parameters and require a long time to obtain better strategies. In this paper, we propose a differentiable policy search pipeline for data augmentation, which is much faster than previous methods. We introduce approximate gradients for several transformation operations with discrete parameters as well as the differentiable mechanism for selecting operations.
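As a rough illustration of what a "differentiable mechanism for selecting operations" can look like, the sketch below relaxes the choice among a few toy operations into a softmax-weighted mixture, so the selection logits receive a gradient. This is an assumption made for illustration, not the paper's implementation: the operations, the names `soft_select` and `grad_wrt_alphas`, and the brightness loss are all hypothetical, and the gradient is estimated by finite differences rather than backpropagation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy "augmentation operations" on pixel intensities in [0, 1].
def identity(pixels):
    return list(pixels)

def invert(pixels):
    return [1.0 - p for p in pixels]

def brighten(pixels):
    return [min(1.0, p + 0.2) for p in pixels]

OPS = [identity, invert, brighten]

def soft_select(pixels, alphas):
    """Mix the outputs of all operations with softmax(alphas) as weights.

    A hard choice of one operation has no gradient; this weighted sum is
    smooth in alphas, so the selection logits can be tuned by gradient descent.
    """
    weights = softmax(alphas)
    outputs = [op(pixels) for op in OPS]
    return [
        sum(w * out[i] for w, out in zip(weights, outputs))
        for i in range(len(pixels))
    ]

def grad_wrt_alphas(pixels, alphas, loss_fn, eps=1e-6):
    """Central finite-difference estimate of d loss / d alphas."""
    grads = []
    for k in range(len(alphas)):
        up = list(alphas)
        up[k] += eps
        down = list(alphas)
        down[k] -= eps
        diff = loss_fn(soft_select(pixels, up)) - loss_fn(soft_select(pixels, down))
        grads.append(diff / (2 * eps))
    return grads

def mean_brightness(pixels):
    """Stand-in objective: average intensity of the mixed output."""
    return sum(pixels) / len(pixels)

pixels = [0.1, 0.5, 0.9]
alphas = [0.0, 0.0, 0.0]  # equal preference for every operation
mixed = soft_select(pixels, alphas)
grads = grad_wrt_alphas(pixels, alphas, mean_brightness)
print(mixed, grads)
```

In this toy setup the gradient for `brighten` is positive and the other two are negative, which is the signal a gradient-based search would use to shift probability toward one operation.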

r/MachineLearning - [R] Soft-Label Dataset Distillation and Text Dataset Distillation


Dataset distillation is a method for reducing dataset sizes by learning a small number of synthetic samples containing all the information of a large dataset. This has several benefits like speeding up model training, reducing energy consumption, and reducing required storage space. Currently, each synthetic sample is assigned a single "hard" label, and also, dataset distillation can currently only be used with image data. We propose to simultaneously distill both images and their labels, thus assigning each synthetic sample a "soft" label (a distribution of labels). Using "soft" labels also enables distilled datasets to consist of fewer samples than there are classes as each sample can encode information for multiple classes.
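To make the hard versus soft label distinction concrete, here is a minimal sketch of cross-entropy against a label distribution. It only illustrates the idea of a soft label. It is not the authors' distillation procedure, and the function names, logits, and label values are made up for the example.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_cross_entropy(logits, label_dist):
    """Cross-entropy between a label distribution and the model's softmax output.

    With a one-hot label_dist this reduces to the usual hard-label loss.
    """
    probs = softmax(logits)
    return -sum(q * math.log(p) for q, p in zip(label_dist, probs) if q > 0)

# Hypothetical model outputs for one synthetic sample in a 3-class problem.
logits = [2.0, 1.0, -1.0]

hard_label = [1.0, 0.0, 0.0]  # the sample speaks for class 0 only
soft_label = [0.6, 0.4, 0.0]  # one sample carries signal for classes 0 and 1

loss_hard = soft_cross_entropy(logits, hard_label)
loss_soft = soft_cross_entropy(logits, soft_label)
print(loss_hard, loss_soft)
```

Because the soft label puts weight on two classes at once, a single synthetic sample contributes training signal for both, which is how a distilled set can hold fewer samples than there are classes.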

Defending Against Neural Fake News

Neural Information Processing Systems

Recent progress in natural language generation has raised dual-use concerns. While applications like summarization and translation are positive, the underlying technology also might enable adversaries to generate neural fake news: targeted propaganda that closely mimics the style of real news. Modern computer security relies on careful threat modeling: identifying potential threats and vulnerabilities from an adversary's point of view, and exploring potential mitigations to these threats. Likewise, developing robust defenses against neural fake news requires us first to carefully investigate and characterize the risks of these models. We thus present a model for controllable text generation called Grover.

In the new bot economy, cloud robotics and AI transform work and society in far-reaching ways - SiliconANGLE


As the calendar reaches its last month in 2019, the bot is hot. Research firm Tractica LLC has forecast that a combination of cloud computing and robotic hardware, software and services will propel global revenue in the cloud robotics field from single digits to in excess of $170 billion within the next five years. This is about a lot more than having robots deliver concierge services or burritos. Bots are having a major impact on how more than 1 billion active Instagram users channel posts to reach target audiences. And "Grinch bots" are reportedly dominating online traffic to retailer login pages this week to elbow out human shoppers for the best deals.



Most stuff here is just raw unstructured text data; if you are looking for annotated corpora or Treebanks, refer to the sources at the bottom. Blog Authorship Corpus: consists of the collected posts of 19,320 bloggers gathered in August 2004. Amazon Fine Food Reviews [Kaggle]: consists of 568,454 food reviews Amazon users left up to October 2012. ASAP Automated Essay Scoring [Kaggle]: For this competition, there are eight essay sets. Each of the sets of essays was generated from a single prompt.

Five fascinating ways AI is changing advertising - Videa


As in a lot of industries, artificial intelligence (AI) is changing advertising right before our eyes. Today, AI is getting attention for writing emotive TV scripts, targeting smart ads and using facial recognition to recommend products based on personal preferences. In each of these scenarios, it's helping advertising professionals do their jobs with intelligence backed by solid data, far more data than humans have the time or capacity to analyze. And while there's some debate over how far some of these technologies should go, whether they violate privacy, and whether they keep customer data secure, there's a lot of excitement over AI's potential. So how is AI making an impact on the television advertising industry – and the people in it – today?

Global Artificial Intelligence in Law Market 2019 by Manufacturers, Countries, Type and Application …


The research report "Artificial Intelligence in Law Market – Global Industry Analysis 2019–2025" offers precise analytical information about the …

Deepfake video: It takes AI to beat AI


By now, most of us have shared a few chuckles over AI-generated deepfake videos, like those in which the face of comedian and impressionist Bill Hader gradually takes on the likenesses of Tom Cruise, Seth Rogen, and Arnold Schwarzenegger as he imitates the celebrities. We've seen actor Ryan Reynolds' mug superimposed over Gene Wilder's in the 1971 classic film "Willy Wonka & the Chocolate Factory." We've even marveled over businessman Elon Musk being turned into a baby. It all can be quite humorous, but not everyone is laughing. Tech companies, researchers, and politicians alike are growing concerned that the increasing sophistication of the artificial intelligence and machine learning technology powering deepfakes will outpace our ability to discern between genuine and doctored imagery.

Tesla on autopilot rear-ended Connecticut cop car as driver checked on dog: police

FOX News

A Tesla on autopilot rear-ended a Connecticut trooper's vehicle early Saturday as the driver was checking on his dog in the back seat, state police said. Police said they had responded to a disabled vehicle that was stopped in the middle of Interstate 95. While they were waiting for a tow, the self-driving Tesla came down the road.