AITopics

1606.07025

Country:

North America > United States (0.46)
Europe > United Kingdom (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Jain, Lalit, Jamieson, Kevin, Nowak, Robert

Finite Sample Prediction and Recovery Bounds for Ordinal Embedding

arXiv.org Machine LearningJun-22-2016

The goal of ordinal embedding is to represent items as points in a low-dimensional Euclidean space given a set of constraints in the form of distance comparisons like "item $i$ is closer to item $j$ than item $k$". Ordinal constraints like this often come from human judgments. To account for errors and variation in judgments, we consider the noisy situation in which the given constraints are independently corrupted by reversing the correct constraint with some probability. This paper makes several new contributions to this problem. First, we derive prediction error bounds for ordinal embedding with noise by exploiting the fact that the rank of a distance matrix of points in $\mathbb{R}^d$ is at most $d+2$. These bounds characterize how well a learned embedding predicts new comparative judgments. Second, we investigate the special case of a known noise model and study the Maximum Likelihood estimator. Third, knowledge of the noise model enables us to relate prediction errors to embedding accuracy. This relationship is highly non-trivial since we show that the linear map corresponding to distance comparisons is non-invertible, but there exists a nonlinear map that is invertible. Fourth, two new algorithms for ordinal embedding are proposed and evaluated in experiments.

artificial intelligence, bayesian inference, machine learning, (17 more...)

1606.07081

Country: North America > United States > Wisconsin (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

#artificialintelligenceJun-21-2016, 01:46:02 GMT

Bayesian Statistics explained to Beginners in Simple English

Bayesian Statistics continues to remain incomprehensible in the ignited minds of many analysts. Being amazed by the incredible power of machine learning, a lot of us have become unfaithful to statistics. Our focus has narrowed down to exploring machine learning. We fail to understand that machine learning is only one way to solve real world problems. In several situations, it does not help us solve business problems, even though there is data involved in these problems. To say the least, knowledge of statistics will allow you to work on complex analytical problems, irrespective of the size of data. In 1770s, Thomas Bayes introduced'Bayes Theorem'.

artificial intelligence, bayesian inference, machine learning, (18 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Artificial IntelligenceJun-21-2016

Variable Elimination in the Fourier Domain

Xue, Yexiang, Ermon, Stefano, Bras, Ronan Le, Gomes, Carla P., Selman, Bart

The ability to represent complex high dimensional probability distributions in a compact form is one of the key insights in the field of graphical models. Factored representations are ubiquitous in machine learning and lead to major computational advantages. We explore a different type of compact representation based on discrete Fourier representations, complementing the classical approach based on conditional independencies. We show that a large class of probabilistic graphical models have a compact Fourier representation. This theoretical result opens up an entirely new way of approximating a probability distribution. We demonstrate the significance of this approach by applying it to the variable elimination algorithm. Compared with the traditional bucket representation and other approximate inference algorithms, we obtain significant improvements.

algorithm, fourier representation, representation, (15 more...)

arXiv.org Artificial Intelligence

1508.04032

Country:

North America > United States > New York > Tompkins County > Ithaca (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Saparov, Abulhair, Mitchell, Tom M.

A Probabilistic Generative Grammar for Semantic Parsing

We present a framework that couples the syntax and semantics of natural language sentences in a generative model, in order to develop a semantic parser that jointly infers the syntactic, morphological, and semantic representations of a given sentence under the guidance of background knowledge. To generate a sentence in our framework, a semantic statement is first sampled from a prior, such as from a set of beliefs in a knowledge base. Given this semantic statement, a grammar probabilistically generates the output sentence. A joint semantic-syntactic parser is derived that returns the $k$-best semantic and syntactic parses for a given sentence. The semantic prior is flexible, and can be used to incorporate background knowledge during parsing, in ways unlike previous semantic parsing approaches. For example, semantic statements corresponding to beliefs in a knowledge base can be given higher prior probability, type-correct statements can be given somewhat lower probability, and beliefs outside the knowledge base can be given lower probability. The construction of our grammar invokes a novel application of hierarchical Dirichlet processes (HDPs), which in turn, requires a novel and efficient inference approach. We present experimental results showing, for a simple grammar, that our parser outperforms a state-of-the-art CCG semantic parser and scales to knowledge bases with millions of beliefs.

artificial intelligence, machine learning, natural language, (17 more...)

1606.06361

Country: North America > United States (1.00)

Genre:

Research Report > New Finding (0.48)
Overview > Innovation (0.34)

Industry: Leisure & Entertainment > Sports > Tennis (0.31)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Masood, Arjumand, Pan, Weiwei, Doshi-Velez, Finale

An Empirical Comparison of Sampling Quality Metrics: A Case Study for Bayesian Nonnegative Matrix Factorization

In this work, we empirically explore the question: how can we assess the quality of samples from some target distribution? We assume that the samples are provided by some valid Monte Carlo procedure, so we are guaranteed that the collection of samples will asymptotically approximate the true distribution. Most current evaluation approaches focus on two questions: (1) Has the chain mixed, that is, is it sampling from the distribution? and (2) How independent are the samples (as MCMC procedures produce correlated samples)? Focusing on the case of Bayesian nonnegative matrix factorization, we empirically evaluate standard metrics of sampler quality as well as propose new metrics to capture aspects that these measures fail to expose. The aspect of sampling that is of particular interest to us is the ability (or inability) of sampling methods to move between multiple optima in NMF problems. As a proxy, we propose and study a number of metrics that might quantify the diversity of a set of NMF factorizations obtained by a sampler through quantifying the coverage of the posterior distribution. We compare the performance of a number of standard sampling methods for NMF in terms of these new metrics.

artificial intelligence, bayesian inference, machine learning, (15 more...)

1606.0625

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Tang, Bo, Baggenstoss, Paul M., He, Haibo

Kernel-based Generative Learning in Distortion Feature Space

This paper presents a novel kernel-based generative classifier which is defined in a distortion subspace using polynomial series expansion, named Kernel-Distortion (KD) classifier. An iterative kernel selection algorithm is developed to steadily improve classification performance by repeatedly removing and adding kernels. The experimental results on character recognition application not only show that the proposed generative classifier performs better than many existing classifiers, but also illustrate that it has different recognition capability compared to the state-of-the-art discriminative classifier - deep belief network. The recognition diversity indicates that a hybrid combination of the proposed generative classifier and the discriminative classifier could further improve the classification performance. Two hybrid combination methods, cascading and stacking, have been implemented to verify the diversity and the improvement of the proposed classifier. Keywords: Distortion feature space, kernel-based generative classifier, hybrid classification, deep belief nets, character recognition 1. Introduction Learning and inference are two important aspects for any machine learning application.

classifier, machine learning, pattern recognition, (20 more...)

1606.06377

Country: North America > United States (0.69)

Genre: Research Report > New Finding (0.46)

Industry: Government (0.31)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
(2 more...)

FSMJ: Feature Selection with Maximum Jensen-Shannon Divergence for Text Categorization

Tang, Bo, He, Haibo

In this paper, we present a new wrapper feature selection approach based on Jensen-Shannon (JS) divergence, termed feature selection with maximum JS-divergence (FSMJ), for text categorization. Unlike most existing feature selection approaches, the proposed FSMJ approach is based on real-valued features which provide more information for discrimination than binary-valued features used in conventional approaches. We show that the FSMJ is a greedy approach and the JS-divergence monotonically increases when more features are selected. We conduct several experiments on real-life data sets, compared with the state-of-the-art feature selection approaches for text categorization. The superior performance of the proposed FSMJ approach demonstrates its effectiveness and further indicates its wide potential applications on data mining.

machine learning, natural language, text categorization, (15 more...)

1606.06366

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

#artificialintelligenceJun-18-2016, 11:40:37 GMT

Fuzzy logic helps detect redirection spam

Web browsers might soon use fuzzy logic to spot redirection spam and save users from being scammed, phished or opening malicious sites unwittingly, according to researchers in India writing in the International Journal of Electronic Security and Digital Forensics. Redirection spam occurs when a user opens a link in an email that leads to an unexpected and often malicious page, or when they open a page that has been hacked or injected with malware, which then redirects to a malicious page. Often the redirection occurs instantaneously and transparently without the user being aware until it is too late and login details or credit card number have been divulged to the criminal third party. Frequently, there will be a malware payload that infects the user's computer at the same time. According to Kanchan Hans of Amity University, in Noida, India, and colleagues, legitimate web page redirections are a ubiquitous part of the web used for server load balancing, link logging and URL rewriting and shortening.

artificial intelligence, fuzzy logic, redirection spam, (9 more...)

#artificialintelligence

Country:

Asia > India (0.47)
North America > United States > Washington (0.05)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.66)
Information Technology > Communications > Web (0.58)

arXiv.org Artificial IntelligenceJun-18-2016

Discovery and Visualization of Nonstationary Causal Models

Zhang, Kun, Huang, Biwei, Zhang, Jiji, Schölkopf, Bernhard, Glymour, Clark

It is commonplace to encounter nonstationary data, of which the underlying generating process may change over time or across domains. The nonstationarity presents both challenges and opportunities for causal discovery. In this paper we propose a principled framework to handle nonstationarity, and develop some methods to address three important questions. First, we propose an enhanced constraint-based method to detect variables whose local mechanisms are nonstationary and recover the skeleton of the causal structure over observed variables. Second, we present a way to determine some causal directions by taking advantage of information carried by changing distributions. Third, we develop a method for visualizing the nonstationarity of causal modules. Experimental results on various synthetic and real-world data sets are presented to demonstrate the efficacy of our methods.

artificial intelligence, causal direction, machine learning, (17 more...)

arXiv.org Artificial Intelligence

1509.08056

Country:

North America > United States (0.46)
North America > Canada (0.28)
Europe > Germany (0.28)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre: Research Report (1.00)

Industry:

Banking & Finance (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)