AITopics

Ceritli, Taha, Williams, Christopher K. I., Geddes, James

ptype: Probabilistic Type Inference

arXiv.org Machine Learning, Nov-22-2019

The data type, missing data, and anomalies can be defined in broad terms as follows. The data type is the common characteristic expected to be shared by the entries in a column, such as integers, strings, IP addresses, or dates; missing data denotes the absence of a data value, which can be encoded in various ways; and anomalies are values whose types differ from both the column type and the missing type. To model these types, we have developed probabilistic finite-state machines (PFSMs) that can generate values from the corresponding domains. This, in turn, allows us to calculate the probability of a given data value being generated by a particular PFSM. We then combine these PFSMs in our model so that a data column x can be annotated via probabilistic inference, i.e., given a column of data, we can infer the column type and the rows with missing and anomalous values.
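The idea of scoring each value under several generative machines and then inferring the column type can be illustrated with a minimal Python sketch. This is not the ptype implementation: the functions `p_integer`, `p_string`, `p_missing`, and `p_any`, the missing-value encodings, and the mixture weights are all hypothetical stand-ins chosen for illustration, and the column type is picked by maximum likelihood under a uniform prior.

```python
import math

DIGITS = "0123456789"
MISSING = {"", "NA", "NULL", "-"}  # assumed encodings of missing data


def _run_prob(n, p_char, p_stop):
    """Probability of emitting n characters (p_char each), then stopping."""
    return (p_char ** n) * ((1 - p_stop) ** (n - 1)) * p_stop


def p_integer(s, p_stop=0.2):
    """Toy machine for integers: one or more decimal digits."""
    if not s or any(c not in DIGITS for c in s):
        return 0.0
    return _run_prob(len(s), 1 / 10, p_stop)


def p_any(s, p_stop=0.2):
    """Toy broad machine: one or more printable ASCII characters."""
    if not s or any(not (32 <= ord(c) < 127) for c in s):
        return 0.0
    return _run_prob(len(s), 1 / 95, p_stop)


p_string = p_any  # the string type reuses the broad machine in this sketch


def p_missing(s):
    """Toy machine for missing data: uniform over the assumed encodings."""
    return 1 / len(MISSING) if s in MISSING else 0.0


TYPES = {"integer": p_integer, "string": p_string}
W_TYPE, W_MISSING, W_ANOMALY = 0.90, 0.05, 0.05  # illustrative weights


def infer_column(column):
    """Return the most likely column type and a per-row annotation."""
    log_lik = {}
    for name, p_type in TYPES.items():
        total = 0.0
        for v in column:
            mix = (W_TYPE * p_type(v) + W_MISSING * p_missing(v)
                   + W_ANOMALY * p_any(v))
            total += math.log(mix + 1e-300)  # guard against log(0)
        log_lik[name] = total
    best = max(log_lik, key=log_lik.get)

    labels = []
    for v in column:
        scores = {
            "type": W_TYPE * TYPES[best](v),
            "missing": W_MISSING * p_missing(v),
            "anomaly": W_ANOMALY * p_any(v),
        }
        labels.append(max(scores, key=scores.get))
    return best, labels


column = ["12", "7", "NA", "305", "abc", "41"]
print(infer_column(column))
```

In this toy setting each row is explained by whichever of the three components (column type, missing, anomaly) gives it the highest weighted probability, which mirrors the row-level annotation described in the abstract.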

column type, data type, probability, (17 more...)

arXiv.org Machine Learning

1911.10081

Country:

Oceania > Australia > Australian Capital Territory > Canberra (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(2 more...)

Chatterjee, Oishik, Ramakrishnan, Ganesh, Sarawagi, Sunita

Data Programming using Continuous and Quality-Guided Labeling Functions

arXiv.org Machine Learning, Nov-22-2019

Sunita Sarawagi, Department of CSE, IIT Bombay, India, sunita@iitb.ac.in

Abstract: Scarcity of labeled data is a bottleneck for supervised learning models. A paradigm that has evolved for dealing with this problem is data programming. An existing data programming paradigm allows human supervision to be provided as a set of discrete labeling functions (LFs) that assign possibly noisy labels to input instances, together with a generative model for consolidating the weak labels. We enhance and generalize this paradigm by supporting functions that output a continuous score (instead of a hard label) that noisily correlates with labels. We show across five applications that continuous LFs are more natural to program and lead to improved recall. We also show that the accuracy of existing generative models is unstable with respect to initialization, training epochs, and learning rates. We give control to the data programmer to guide the training process by providing intuitive quality guides with each LF. We propose an elegant method of incorporating these guides into the generative model. Our overall method, called CAGE, makes the data programming paradigm more reliable than other tricks based on initialization, sign-penalties, or soft-accuracy constraints.

1 Introduction

Modern machine learning systems require large amounts of labelled data. For many applications, such labelled data is created by getting humans to explicitly label each training example. A problem of perpetual interest in machine learning is reducing the tedium of such human supervision via techniques like active learning, crowd-labeling, distant supervision, and semi-supervised learning.
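The difference between a discrete labeling function and a continuous one can be sketched in a few lines of Python. Everything here is hypothetical: the spam/non-spam task, the trigger words, and the function names are invented for illustration, and the score-weighted vote is a deliberately simple stand-in for consolidation, not the CAGE generative model.

```python
ABSTAIN = None
SPAM, HAM = 1, 0


def lf_has_free(text):
    """Discrete LF: a hard SPAM vote if the word 'free' appears."""
    return (SPAM, 1.0) if "free" in text.lower().split() else ABSTAIN


def lf_spam_words(text, triggers=("free", "winner", "prize")):
    """Continuous LF: SPAM vote scored by the fraction of triggers present."""
    words = set(text.lower().split())
    score = sum(t in words for t in triggers) / len(triggers)
    return (SPAM, score) if score > 0 else ABSTAIN


def lf_work_words(text, triggers=("meeting", "report", "team")):
    """Continuous LF: HAM vote scored by the fraction of triggers present."""
    words = set(text.lower().split())
    score = sum(t in words for t in triggers) / len(triggers)
    return (HAM, score) if score > 0 else ABSTAIN


LFS = [lf_has_free, lf_spam_words, lf_work_words]


def weighted_vote(text, lfs=LFS):
    """Sum the scores per label; return None if all LFs abstain or tie."""
    totals = {SPAM: 0.0, HAM: 0.0}
    for lf in lfs:
        vote = lf(text)
        if vote is ABSTAIN:
            continue
        label, score = vote
        totals[label] += score
    if totals[SPAM] == totals[HAM]:
        return None
    return max(totals, key=totals.get)


print(weighted_vote("Free prize inside"))
print(weighted_vote("team meeting report attached"))
```

The continuous LFs carry graded evidence (one trigger word counts for less than three), which a hard vote cannot express; a learned generative model would replace the fixed summation used here.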

accuracy, continuous lf, quality guide, (16 more...)

arXiv.org Machine Learning

1911.0986

Country:

Asia > India (0.25)
North America > United States > California > San Francisco County > San Francisco (0.14)
Oceania > Australia > Victoria > Melbourne (0.04)
(14 more...)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Simpson, Edwin, Gao, Yang, Gurevych, Iryna

Interactive Text Ranking with Bayesian Optimisation: A Case Study on Community QA and Summarisation

For many NLP applications, such as question answering and summarisation, the goal is to select the best solution from a large space of candidates to meet a particular user's needs. To address the lack of user-specific training data, we propose an interactive text ranking approach that actively selects pairs of candidates, from which the user selects the best. Unlike previous strategies, which attempt to learn a ranking across the whole candidate space, our method employs Bayesian optimisation to focus the user's labelling effort on high quality candidates and integrates prior knowledge in a Bayesian manner to cope better with small data scenarios. We apply our method to community question answering (cQA) and extractive summarisation, finding that it significantly outperforms existing interactive approaches. We also show that the ranking function learned by our method is an effective reward function for reinforcement learning, which improves the state of the art for interactive summarisation.

computational linguistic, learning, prediction, (15 more...)

1911.10183

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Belgium > Brussels-Capital Region > Brussels (0.05)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
(9 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Government (1.00)
Law (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.67)

Duhamel, Thibault, Maynard, Mariane, Kabanza, Froduald

A Transfer Learning Method for Goal Recognition Exploiting Cross-Domain Spatial Features

The ability to infer the intentions of others, predict their goals, and deduce their plans is a critical feature for intelligent agents. For a long time, several approaches investigated the use of symbolic representations and inferences with limited success, principally because it is difficult to capture the cognitive knowledge behind human decisions explicitly. The trend, nowadays, is increasingly focusing on learning to infer intentions directly from data, using deep learning in particular. We are now observing interesting applications of intent classification in natural language processing, visual activity recognition, and emerging approaches in other domains. This paper discusses a novel approach combining few-shot and transfer learning with cross-domain features, to learn to infer the intent of an agent navigating in physical environments, executing arbitrarily long sequences of actions to achieve their goals. Experiments in synthetic environments demonstrate improved performance in terms of learning from few samples and generalizing to unseen configurations, compared to a deep-learning baseline approach.

configuration, goal recognition, recognition, (17 more...)

1911.10134

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Oceania > Australia > New South Wales > Sydney (0.04)
North America > Canada (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Computer Games (0.94)

Minimalistic Attacks: How Little it Takes to Fool a Deep Reinforcement Learning Policy

Qu, Xinghua, Sun, Zhu, Ong, Yew-Soon, Wei, Pengfei, Gupta, Abhishek

Recent studies have revealed that neural network-based policies can be easily fooled by adversarial examples. However, while most prior works analyze the effects of perturbing every pixel of every frame assuming white-box policy access, in this paper we take a more restrictive view towards adversary generation, with the goal of unveiling the limits of a model's vulnerability. In particular, we explore minimalistic attacks by defining three key settings: (1) black-box policy access: where the attacker only has access to the input (state) and output (action probability) of an RL policy; (2) fractional-state adversary: where only several pixels are perturbed, with the extreme case being a single-pixel adversary; and (3) tactically-chanced attack: where only significant frames are tactically chosen to be attacked. We formulate the adversarial attack by accommodating the three key settings and explore their potency on six Atari games by examining four fully trained state-of-the-art policies. In Breakout, for example, we surprisingly find that: (i) all policies showcase significant performance degradation by merely modifying 0.01% of the input state, and (ii) the policy trained by DQN is totally deceived by perturbation to only 1% of the frames.
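The combination of black-box access and a single-pixel adversary can be illustrated with a small Python sketch. It is not the attack from the paper: `toy_policy` is a hypothetical two-action policy over four "pixels", and the search is a plain exhaustive loop over pixel positions and candidate values, chosen only because it is easy to follow. The attacker uses nothing but the action probabilities returned by the policy.

```python
import math


def argmax(xs):
    """Index of the largest element."""
    return max(range(len(xs)), key=xs.__getitem__)


def toy_policy(state):
    """Hypothetical policy: softmax over the sums of the two halves."""
    half = len(state) // 2
    logits = [sum(state[:half]), sum(state[half:])]
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]


def single_pixel_attack(policy, state, values=(0.0, 1.0)):
    """Try every (pixel, value) pair using only policy outputs.

    Returns (success, pixel_index, new_value) for the perturbation that
    most reduces the probability of the originally chosen action.
    """
    clean_action = argmax(policy(state))
    best = None
    for i in range(len(state)):
        for v in values:
            perturbed = list(state)
            perturbed[i] = v
            p = policy(perturbed)[clean_action]
            if best is None or p < best[0]:
                best = (p, i, v)
    _, i, v = best
    perturbed = list(state)
    perturbed[i] = v
    success = argmax(policy(perturbed)) != clean_action
    return success, i, v


state = [0.5, 0.5, 0.4, 0.4]  # the clean state favours action 0
print(single_pixel_attack(toy_policy, state))
```

On a real image-sized state an exhaustive loop would require far too many queries, so a practical attack needs a more query-efficient black-box optimiser.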

adversarial attack, breakout, ieee transaction, (13 more...)

1911.03849

Country:

Asia > China (0.04)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
Europe > United Kingdom > England > Hampshire > Southampton (0.04)
(3 more...)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Games > Computer Games (0.70)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Learning Internal Representations (PhD Thesis)

Baxter, Jonathan

Most machine learning theory and practice is concerned with learning a single task. In this thesis it is argued that in general there is insufficient information in a single task for a learner to generalise well and that what is required for good generalisation is information about many similar learning tasks. Similar learning tasks form a body of prior information that can be used to constrain the learner and make it generalise better. Examples of learning scenarios in which there are many similar tasks are handwritten character recognition and spoken word recognition. The concept of the environment of a learner is introduced as a probability measure over the set of learning problems the learner might be expected to learn. It is shown how a sample from the environment may be used to learn a representation, or recoding of the input space that is appropriate for the environment. Learning a representation can equivalently be thought of as learning the appropriate features of the environment. Bounds are derived on the sample size required to ensure good generalisation from a representation learning process. These bounds show that under certain circumstances learning a representation appropriate for $n$ tasks reduces the number of examples required of each task by a factor of $n$. Once a representation is learnt it can be used to learn novel tasks from the same environment, with the result that far fewer examples are required of the new tasks to ensure good generalisation. Bounds are given on the number of tasks and the number of samples from each task required to ensure that a representation will be a good one for learning novel tasks. The results on representation learning are generalised to cover any form of automated hypothesis space bias.

empirical loss, learner, representation, (11 more...)

1911.03731

Country:

Oceania > Australia > South Australia (0.14)
North America > United States > New York (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.81)

Industry: Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

#artificialintelligence, Nov-21-2019, 22:41:46 GMT

Managing Marketing: Realising The Full Value Of Customer Experience With AI (Artificial Intelligence)

Mercer Bell is a customer experience agency. Technically, we were the first in this market as far as being a trademark CX agency. What does that mean nowadays? Nowadays it's a really big complicated broad church of things we do for our clients, including working with aspects of AI. That is everything from deploying it for our clients on an ongoing basis, helping clients message features of artificial intelligence to their clients, and then actually building bespoke things, particularly in the machine learning space for our clients on an ongoing basis.

artificial intelligence, machine learning, marketer, (12 more...)

#artificialintelligence

Country:

North America > United States > New York (0.04)
Oceania > Australia (0.04)

Industry:

Information Technology (0.68)
Law (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Pattern Recognition (0.46)

#artificialintelligence, Nov-21-2019, 15:45:10 GMT

Investorideas.com Newswire - AI News: VSBLTY (CSE: VSBY) (OTC: VSBGF) Launches Two Security Initiatives to Reduce Crime and Make South African Communities Safer

VSBLTY Groupe Technologies Corp. (CSE: VSBY) (5VS.F) (VSBGF), a leading retail software and technology company, announced today that, in partnership with Onyx-Cognivas Pty., it is launching two privately-led security deployments in South Africa to support community safety initiatives. The state-of-the-art security technology will protect two prominent high-rise residential apartment buildings in the upmarket Sandton area, a high-income residential, financial and business suburb of Johannesburg with a population of 225,000. The rollout plan is to deploy this technology across several apartment blocks, a hotel and commercial properties in the precinct, with the objective of deploying a "private Smart City". In addition, advanced custom sensory applications are planned to be installed in a well-known petroleum group with convenience stores/service stations throughout South Africa. The announcement was made by Jay Hutton, VSBLTY co-founder and CEO, who said, "We are excited to provide complete Smart City-like security solutions in Sandton. This state-of-the-art technology uses the power of machine learning and computer vision."

investoridea, make south african community safer, vsblty, (12 more...)

#artificialintelligence

Country:

Africa > South Africa > Gauteng > Johannesburg (0.26)
Oceania > Australia (0.05)

Genre: Press Release (1.00)

Industry:

Information Technology (0.91)
Banking & Finance > Real Estate (0.71)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.52)
Media > News (0.51)

Technology:

Information Technology > Artificial Intelligence > Vision (0.51)
Information Technology > Artificial Intelligence > Machine Learning (0.37)
Information Technology > Communications > Social Media (0.32)

#artificialintelligence, Nov-21-2019, 14:03:32 GMT

The Week It Snowed Everywhere - New Zealand News Centre

NIWA and Microsoft Corp. are teaming up to make artificial intelligence handwriting recognition more accurate and efficient in a project that will support climate research. The project aims to develop better training sets for handwriting recognition technology that will "read" old weather logs. The first step is to use weather information recorded during a week in July 1939 when it snowed all over New Zealand, including at Cape Reinga. NIWA climate scientist Dr. Andrew Lorrey says the project has the potential to revolutionise how historic data can be used. Microsoft has awarded NIWA an AI for Earth grant for the artificial intelligence project, which will support advances in automating handwriting recognition.

handwriting recognition, lorrey, new zealand news centre, (8 more...)

#artificialintelligence

Country: Oceania > New Zealand > North Island > Auckland Region > Auckland (0.05)

Industry:

Information Technology (0.36)
Food & Agriculture > Agriculture (0.31)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.37)