AITopics

1510.00112

Genre:

Research Report > New Finding (0.34)
Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

Christmann, Andreas, Zhou, Ding-Xuan

On the Robustness of Regularized Pairwise Learning Methods Based on Kernels

arXiv.org Machine Learning Oct-12-2015

Regularized empirical risk minimization, including support vector machines, plays an important role in machine learning theory. In this paper, regularized pairwise learning (RPL) methods based on kernels are investigated. One example is regularized minimization of the error entropy loss, which has recently attracted considerable interest from the viewpoint of consistency and learning rates. This paper shows that such RPL methods additionally have good statistical robustness properties if the loss function and the kernel are chosen appropriately. We treat two cases of particular interest: (i) a bounded and non-convex loss function and (ii) an unbounded convex loss function satisfying a certain Lipschitz-type condition.

artificial intelligence, loss function, machine learning, (17 more...)

1510.03267

Country: Europe (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.54)

Mens, Irini-Eleftheria, Maler, Oded

Learning Regular Languages over Large Ordered Alphabets

arXiv.org Artificial Intelligence Sep-16-2015

This work is concerned with regular languages defined over large alphabets, either infinite or just too large to be expressed enumeratively. We define a generic model where transitions are labeled by elements of a finite partition of the alphabet. We then extend Angluin's L* algorithm for learning regular languages from examples to such automata. We have implemented this algorithm and we demonstrate its behavior where the alphabet is a subset of the natural or real numbers. We sketch the extension of the algorithm to a class of languages over partially ordered alphabets.
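The automaton model described in this abstract, where each transition is labeled by one block of a finite partition of an ordered alphabet, can be illustrated with a minimal sketch. It assumes the alphabet is the natural numbers and that each block is a half-open interval; the class name `IntervalAutomaton` and its representation are hypothetical and not taken from the paper, and the sketch covers only the acceptor, not the L* learning procedure.

```python
from bisect import bisect_right


class IntervalAutomaton:
    """Deterministic automaton whose transitions are labeled by intervals.

    Hypothetical illustration: for each state, the outgoing transitions are a
    sorted list of (lower_bound, target) pairs. Each interval starts at its
    lower bound and ends just before the next lower bound; the last interval
    is unbounded above. Together the intervals partition the alphabet.
    """

    def __init__(self, initial, accepting, transitions):
        self.initial = initial
        self.accepting = set(accepting)
        self.bounds = {s: [lo for lo, _ in edges] for s, edges in transitions.items()}
        self.targets = {s: [t for _, t in edges] for s, edges in transitions.items()}

    def step(self, state, letter):
        # Binary search for the interval that contains the letter.
        idx = bisect_right(self.bounds[state], letter) - 1
        if idx < 0:
            raise ValueError(f"letter {letter!r} is below every interval of {state!r}")
        return self.targets[state][idx]

    def accepts(self, word):
        state = self.initial
        for letter in word:
            state = self.step(state, letter)
        return state in self.accepting


# Two states over the natural numbers:
#   q0: [0, 50) -> q0, [50, inf) -> q1
#   q1: [0, 20) -> q1, [20, inf) -> q0
automaton = IntervalAutomaton(
    initial="q0",
    accepting={"q1"},
    transitions={
        "q0": [(0, "q0"), (50, "q1")],
        "q1": [(0, "q1"), (20, "q0")],
    },
)
```

Storing only the interval boundaries keeps the representation finite even though the alphabet is infinite, which is the property a learner over such alphabets has to exploit.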

artificial intelligence, logic & formal reasoning, machine learning, (20 more...)

doi: 10.2168/LMCS-11(3:13)2015

1506.00482

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
(2 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)

Liang, Jia Hui, Ganesh, Vijay, Raman, Venkatesh, Czarnecki, Krzysztof

SAT-based Analysis of Large Real-world Feature Models is Easy

arXiv.org Artificial Intelligence Jul-28-2015

Modern conflict-driven clause-learning (CDCL) Boolean SAT solvers provide efficient automatic analysis of real-world feature models (FMs) of systems ranging from cars to operating systems. It is well known that solver-based analysis of real-world FMs scales very well even though SAT instances obtained from such FMs are large, and the corresponding analysis problems are known to be NP-complete. To better understand why SAT solvers are so effective, we systematically studied many syntactic and semantic characteristics of a representative set of large real-world FMs. We discovered that a key reason why large real-world FMs are easy to analyze is that the vast majority of the variables in these models are unrestricted, i.e., the models are satisfiable for both true and false assignments to such variables under the current partial assignment. Given this discovery and our understanding of CDCL SAT solvers, we show that solvers can easily find satisfying assignments for such models without too many backtracks relative to the model size, explaining why solvers scale so well. Further analysis showed that the presence of unrestricted variables in these real-world models can be attributed to their high degree of variability. Additionally, we experimented with a series of well-known non-backtracking simplifications that are particularly effective in solving FMs. The remaining variables/clauses after simplifications, called the core, are so few that they are easily solved even with backtracking, further strengthening our conclusions.
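The notion of an unrestricted variable used in this abstract can be made concrete with a small brute-force sketch. It assumes clauses in the usual signed-integer CNF convention (a positive integer is a variable, a negative integer its negation); the function names and the toy feature model are hypothetical, and the exhaustive search is for illustration only and is not how a CDCL solver or the paper's analysis works.

```python
from itertools import product


def satisfiable(clauses, num_vars, fixed):
    """Brute-force check: can the CNF be satisfied given the fixed values?"""
    free = [v for v in range(1, num_vars + 1) if v not in fixed]
    for values in product([False, True], repeat=len(free)):
        assignment = dict(fixed)
        assignment.update(zip(free, values))
        # A clause holds if at least one of its literals is true.
        if all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False


def unrestricted_variables(clauses, num_vars, partial=None):
    """Variables for which both True and False keep the CNF satisfiable
    under the given partial assignment."""
    partial = dict(partial or {})
    result = []
    for v in range(1, num_vars + 1):
        if v in partial:
            continue
        if all(satisfiable(clauses, num_vars, {**partial, v: value})
               for value in (True, False)):
            result.append(v)
    return result


# Toy feature model (hypothetical): feature 1 is the mandatory root,
# features 2 and 3 are optional children of 1, and feature 4 requires 2.
toy_model = [[1], [-2, 1], [-3, 1], [-4, 2]]
```

In the toy model only the root is forced, so most variables can take either value; deselecting feature 2 additionally forces feature 4 off.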

feature model, real-world fms, solver, (15 more...)

1506.05198

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
North America > United States > District of Columbia > Washington (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

A Modularity-Based Random SAT Instances Generator

Giráldez-Cru, Jesús (IIIA-CSIC) | Levy, Jordi (IIIA-CSIC)

Nowadays, many industrial SAT instances can be solved efficiently by modern SAT solvers. However, the number of real-world instances is finite. Therefore, the process of developing and testing SAT solving techniques can benefit from new models of random formulas that capture the features of real-world problems more realistically. In many works, the structure of industrial instances has been analyzed by representing them as graphs and studying some of their properties, such as modularity. In this paper, we use modularity, or community structure, to define a new model of pseudo-industrial random SAT instances, called Community Attachment. We prove that the phase transition point, if it exists, is independent of the modularity. We evaluate the adequacy of this model to real industrial problems in terms of SAT solver performance, and show that modern solvers do actually exploit this community structure.

formula, modularity, solver, (17 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country: Europe > Spain (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.57)

On Conceptual Labeling of a Bag of Words

Sun, Xiangyan (Fudan University) | Xiao, Yanghua (Fudan University) | Wang, Haixun (Google Research) | Wang, Wei (Fudan University)

In natural language processing and information retrieval, the bag of words representation is used to implicitly represent the meaning of the text. Implicit semantics, however, are insufficient in supporting text or natural language based interfaces, which are adopted by an increasing number of applications. Indeed, in applications ranging from automatic ontology construction to question answering, explicit representation of semantics is starting to play a more prominent role. In this paper, we introduce the task of conceptual labeling (CL), which aims at generating a minimum set of conceptual labels that best summarize a bag of words. We draw the labels from a data-driven semantic network that contains millions of highly connected concepts. The semantic network provides meaning to the concepts, and in turn, it provides meaning to the bag of words through the conceptual labels we generate. To achieve our goal, we use an information-theoretic approach to trade off the semantic coverage of a bag of words against the minimality of the output labels. Specifically, we use Minimum Description Length (MDL) as the criterion in selecting the best concepts. Our extensive experimental results demonstrate the effectiveness of our approach in representing the explicit semantics of a bag of words.

code length, conceptual label, probase, (13 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

Asia > China > Shanghai > Shanghai (0.05)
North America > United States (0.04)
Asia > India (0.04)
(4 more...)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)

Learning Cooperative Games

Balcan, Maria Florina (Carnegie-Mellon University) | Procaccia, Ariel D. (Carnegie-Mellon University) | Zick, Yair (Carnegie-Mellon University)

This paper explores a PAC (probably approximately correct) learning model in cooperative games. Specifically, we are given m random samples of coalitions and their values, taken from some unknown cooperative game; can we predict the values of unseen coalitions? We study the PAC learnability of several well-known classes of cooperative games, such as network flow games, threshold task games, and induced subgraph games. We also establish a novel connection between PAC learnability and core stability: for games that are efficiently learnable, it is possible to find payoff divisions that are likely to be stable using a polynomial number of samples.

algorithm, efficiently pac, pac learnable, (14 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Communications > Collaboration (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)

Efficient Algorithms with Performance Guarantees for the Stochastic Multiple-Choice Knapsack Problem

Tran-Thanh, Long (University of Southampton) | Xia, Yingce (University of Science and Technology of China) | Qin, Tao (Microsoft Research) | Jennings, Nicholas R (University of Southampton)

We study the stochastic multiple-choice knapsack problem, where a set of K items, whose values and weights are random variables, arrives at each time step, and a decision maker has to choose at most one item to put into the knapsack without exceeding its capacity. The goal of the decision maker is to maximise the total expected value of the chosen items with respect to the knapsack capacity and a finite time horizon. We provide the first comprehensive theoretical analysis of the problem. In particular, we propose OPT-S-MCKP, the first algorithm that achieves optimality when the value-weight distributions are known. This algorithm also enjoys O(√T) performance loss, where T is the finite time horizon, in the unknown value-weight distributions scenario. We further develop two novel approximation methods, FR-S-MCKP and G-S-MCKP, and we prove that FR-S-MCKP achieves O(√T) performance loss in both the known and unknown value-weight distribution cases, while enjoying polynomial computational complexity per time step. On the other hand, G-S-MCKP does not have theoretical guarantees, but it still provides good performance in practice with linear running time.

algorithm, knapsack, value-weight distribution, (17 more...)

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country:

Europe > United Kingdom > England > Hampshire > Southampton (0.04)
Asia > China (0.04)

Industry: Education (0.71)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.63)

Liang, Tengyuan, Rakhlin, Alexander, Sridharan, Karthik

Learning with Square Loss: Localization through Offset Rademacher Complexity

arXiv.org Machine Learning Jun-15-2015

Determining the finite-sample behavior of risk in the problem of regression is arguably one of the most basic problems of Learning Theory and Statistics. This behavior can be studied in substantial generality with the tools of empirical process theory. When functions in a given convex class are uniformly bounded, one may verify the so-called "Bernstein condition." The condition--which relates the variance of the increments of the empirical process to their expectation--implies a certain localization phenomenon around the optimum and forms the basis of the analysis via local Rademacher complexities. The technique has been developed in [9, 8, 5, 2, 4], among others, based on Talagrand's celebrated concentration inequality for the supremum of an empirical process. In a recent pathbreaking paper, [14] showed that a large part of this heavy machinery is not necessary for obtaining tight upper bounds on excess loss, even--and especially--if functions are unbounded. Mendelson observed that only one-sided control of the tail is required in the deviation inequality, and, thankfully, it is the tail that can be controlled under very mild assumptions. In a parallel line of work, the search within the online learning setting for an analogue of "localization" has led to a notion of an "offset" Rademacher process [17], yielding--in a rather clean manner--optimal rates for minimax regret in online supervised learning. It was also shown that the supremum of the offset process is a lower bound on the minimax value, thus establishing its intrinsic nature.
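The abstract does not spell out the offset Rademacher process it refers to. The sketch below assumes the commonly stated form, the expectation over random signs of the supremum over the class of (1/n) Σ [ε_i f(x_i) − c f(x_i)²], and estimates it by Monte Carlo for a finite function class. The function name, the finite-class restriction, and the sampling scheme are assumptions for illustration, not the paper's construction.

```python
import random


def offset_rademacher(function_values, c, num_draws=2000, seed=0):
    """Monte Carlo estimate of an offset Rademacher average (assumed form).

    function_values[k][i] holds f_k(x_i) for the k-th function of a finite
    class evaluated at the i-th sample point. The quadratic term c * f(x_i)^2
    is the "offset" that penalizes functions with large values.
    """
    rng = random.Random(seed)
    n = len(function_values[0])
    total = 0.0
    for _ in range(num_draws):
        # Independent uniform random signs, one per sample point.
        eps = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        # Supremum over the finite class for this draw of signs.
        total += max(
            sum(e * v - c * v * v for e, v in zip(eps, values)) / n
            for values in function_values
        )
    return total / num_draws


# Hypothetical class: the zero function and the constant function 1,
# both evaluated at five sample points.
finite_class = [[0.0] * 5, [1.0] * 5]
```

With `c = 0` the quantity reduces to the ordinary empirical Rademacher average of the class; a larger `c` can only shrink it, which reflects how the offset term discounts functions far from zero.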

artificial intelligence, complexity, machine learning, (17 more...)

1502.06134

Country: Oceania > Australia (0.28)

Genre:

Research Report (0.40)
Instructional Material (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.48)

Jayadeva, Soman, Sumit, Bhaya, Amit

A Neurodynamical System for finding a Minimal VC Dimension Classifier

arXiv.org Machine Learning Mar-10-2015

The recently proposed Minimal Complexity Machine (MCM) finds a hyperplane classifier by minimizing an exact bound on the Vapnik-Chervonenkis (VC) dimension. The VC dimension measures the capacity of a learning machine, and a smaller VC dimension leads to improved generalization. On many benchmark datasets, the MCM generalizes better than SVMs and uses far fewer support vectors. In this paper, we describe a neural network, based on a linear dynamical system, that converges to the MCM solution. The proposed MCM dynamical system is conducive to an analogue circuit implementation on a chip or to simulation using Ordinary Differential Equation (ODE) solvers. Numerical experiments on benchmark datasets from the UCI repository show that the proposed approach is scalable and accurate, as we obtain improved accuracies and fewer support vectors (up to 74.3% reduction) with the MCM dynamical system.

artificial intelligence, machine learning, optimization problem, (15 more...)

1503.03148

Country: South America > Brazil > Rio de Janeiro (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.72)