Learning-Assisted Automated Reasoning with Flyspeck
Kaliszyk, Cezary, Urban, Josef
The considerable mathematical knowledge encoded by the Flyspeck project is combined with external automated theorem provers (ATPs) and machine-learning premise selection methods trained on the proofs, producing an AI system capable of answering a wide range of mathematical queries automatically. The performance of this architecture is evaluated in a bootstrapping scenario emulating the development of Flyspeck from axioms to the last theorem, each time using only the previous theorems and proofs. It is shown that 39% of the 14185 theorems could be proved in a push-button mode (without any high-level advice and user interaction) in 30 seconds of real time on a fourteen-CPU workstation. The necessary work involves: (i) an implementation of sound translations of the HOL Light logic to ATP formalisms: untyped first-order, polymorphic typed first-order, and typed higher-order, (ii) export of the dependency information from HOL Light and ATP proofs for the machine learners, and (iii) choice of suitable representations and methods for learning from previous proofs, and their integration as advisors with HOL Light. This work is described and discussed here, and an initial analysis of the body of proofs that were found fully automatically is provided.
Local Rademacher Complexity for Multi-label Learning
Xu, Chang, Liu, Tongliang, Tao, Dacheng, Xu, Chao
We analyze the local Rademacher complexity of empirical risk minimization (ERM)-based multi-label learning algorithms, and in doing so propose a new algorithm for multi-label learning. Rather than using the trace norm to regularize the multi-label predictor, we instead minimize the tail sum of the singular values of the predictor. Benefiting from the use of the local Rademacher complexity, our algorithm has a sharper generalization error bound and a faster convergence rate. Compared to methods that minimize over all singular values, concentrating on the tail singular values results in better recovery of the low-rank structure of the multi-label predictor, which plays an important role in exploiting label correlations. We propose a new conditional singular value thresholding algorithm to solve the resulting optimization problem. Empirical studies on real-world datasets validate our theoretical results and demonstrate the effectiveness of the proposed algorithm.
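The contrast between the two regularizers mentioned above can be written out as follows. This is only a sketch: the symbols W (the predictor matrix, assumed to be d x L), sigma_i (its singular values in non-increasing order), and theta (the cut-off index) are notational assumptions, not taken from the abstract.

```latex
% Trace norm: sum over all singular values of W
\|W\|_{*} = \sum_{i=1}^{\min(d,L)} \sigma_i(W)

% Tail sum: only the singular values beyond the (assumed) cut-off \theta
\sum_{i > \theta} \sigma_i(W),
\qquad \sigma_1(W) \ge \sigma_2(W) \ge \dots \ge 0
```

Under this reading, the tail sum leaves the leading theta singular values unpenalized, which matches the abstract's point about preserving the low-rank structure.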
Scoring Functions Based on Second Level Score for k-SAT with Long Clauses
It is widely acknowledged that stochastic local search (SLS) algorithms can efficiently find models for satisfiable instances of the satisfiability (SAT) problem, especially for random k-SAT instances. However, compared to random 3-SAT instances where SLS algorithms have shown great success, random k-SAT instances with long clauses remain very difficult. Recently, the notion of second level score, denoted as "score_2", was proposed for improving SLS algorithms on long-clause SAT instances, and was first used in the powerful CCASat solver as a tie breaker. In this paper, we propose three new scoring functions based on score_2. Despite their simplicity, these functions are very effective for solving random k-SAT with long clauses. The first function combines score and score_2, and the second one additionally integrates the diversification property "age". These two functions are used in developing a new SLS algorithm called CScoreSAT. Experimental results on large random 5-SAT and 7-SAT instances near phase transition show that CScoreSAT significantly outperforms previous SLS solvers. However, CScoreSAT cannot rival its competitors on random k-SAT instances at phase transition. We improve CScoreSAT for such instances by another scoring function which combines score_2 with age. The resulting algorithm HScoreSAT exhibits state-of-the-art performance on random k-SAT (k>3) instances at phase transition. We also study the computation of score_2, including its implementation and computational complexity.
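To make the two scoring notions concrete, here is a minimal sketch. It assumes the usual definitions: score counts clauses made satisfied minus clauses broken by a flip, and score_2 counts clauses going from one true literal to two, minus clauses going from two true literals to one. The helper names, the DIMACS-style literal encoding, and the toy formula are illustrative assumptions, not the solvers' actual code.

```python
from typing import Dict, List, Tuple

Clause = List[int]  # DIMACS-style literals: 3 means x3, -3 means NOT x3


def true_count(clause: Clause, assign: Dict[int, bool]) -> int:
    """Number of literals in the clause that are true under the assignment."""
    return sum(1 for lit in clause if assign[abs(lit)] == (lit > 0))


def scores(var: int, clauses: List[Clause], assign: Dict[int, bool]) -> Tuple[int, int]:
    """Return (score, score_2) for flipping `var` (assumed definitions, see lead-in)."""
    flipped = dict(assign)
    flipped[var] = not assign[var]
    score = 0
    score2 = 0
    for clause in clauses:
        if all(abs(lit) != var for lit in clause):
            continue  # the flip cannot affect this clause
        before = true_count(clause, assign)
        after = true_count(clause, flipped)
        if before == 0 and after >= 1:
            score += 1   # clause becomes satisfied ("make")
        elif before >= 1 and after == 0:
            score -= 1   # clause becomes falsified ("break")
        if before == 1 and after == 2:
            score2 += 1  # 1-satisfied clause becomes 2-satisfied
        elif before == 2 and after == 1:
            score2 -= 1  # 2-satisfied clause becomes 1-satisfied
    return score, score2


# Toy formula (hypothetical): (x1 v x2 v x3) & (~x1 v x2 v x4) & (x1 v ~x2 v ~x3)
example_clauses = [[1, 2, 3], [-1, 2, 4], [1, -2, -3]]
example_assign = {1: False, 2: True, 3: False, 4: False}
flip_x1 = scores(1, example_clauses, example_assign)
flip_x3 = scores(3, example_clauses, example_assign)
```

In this toy instance, flipping x1 leaves the number of satisfied clauses unchanged but makes the assignment more robust, which is the situation where score_2 is useful as a tie breaker.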
Generalized Conditional Gradient for Sparse Estimation
Yu, Yaoliang, Zhang, Xinhua, Schuurmans, Dale
Structured sparsity is an important modeling tool that expands the applicability of convex formulations for data analysis; however, it also creates significant challenges for efficient algorithm design. In this paper we investigate the generalized conditional gradient (GCG) algorithm for solving structured sparse optimization problems, demonstrating that, with some enhancements, it can provide a more efficient alternative to current state-of-the-art approaches. After providing a comprehensive overview of the convergence properties of GCG, we develop efficient methods for evaluating polar operators, a subroutine that is required in each GCG iteration. In particular, we show how the polar operator can be efficiently evaluated in two important scenarios: dictionary learning and structured sparse estimation. A further improvement is achieved by interleaving GCG with fixed-rank local subspace optimization. A series of experiments on matrix completion, multi-class classification, multi-view dictionary learning and overlapping group lasso shows that the proposed method can significantly reduce the training cost of current alternatives.
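The role of the polar operator as a per-iteration subroutine can be illustrated with the simplest case. The sketch below is a plain conditional gradient (Frank-Wolfe) loop over an L1 ball, not the paper's GCG variant: the quadratic objective, the radius `tau`, the step size 2/(t+2), and all function names are assumptions chosen for illustration. For the L1 norm the polar is the max-norm, so the oracle only needs the coordinate with the largest absolute gradient.

```python
from typing import List


def polar_l1(grad: List[float], tau: float) -> List[float]:
    """Linear minimization oracle over the L1 ball of radius tau.

    Returns a minimizer of <grad, s> subject to ||s||_1 <= tau, which is a
    signed, scaled coordinate vector at the largest |grad| entry.
    """
    i = max(range(len(grad)), key=lambda j: abs(grad[j]))
    s = [0.0] * len(grad)
    if grad[i] != 0.0:
        s[i] = -tau if grad[i] > 0 else tau
    return s


def conditional_gradient(b: List[float], tau: float, iters: int = 50) -> List[float]:
    """Minimize 0.5 * ||x - b||^2 over the L1 ball of radius tau."""
    x = [0.0] * len(b)
    for t in range(iters):
        grad = [xi - bi for xi, bi in zip(x, b)]  # gradient of the quadratic
        s = polar_l1(grad, tau)                   # one oracle call per iteration
        gamma = 2.0 / (t + 2)                     # standard open-loop step size
        x = [(1 - gamma) * xi + gamma * si for xi, si in zip(x, s)]
    return x


# Hypothetical data: the projection of b onto the unit L1 ball is (1, 0, 0).
x_hat = conditional_gradient([3.0, 0.5, -0.2], tau=1.0)
```

Each iterate is a convex combination of a few oracle outputs, so the iterates stay sparse; that is the property that makes conditional gradient methods attractive for sparse estimation.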
Efficient Modeling and Forecasting of the Electricity Spot Price
Ziel, Florian, Steinert, Rick, Husmann, Sven
The increasing importance of renewable energy, especially solar and wind power, has led to new forces in the formation of electricity prices. Hence, this paper introduces an econometric model for the hourly time series of electricity prices of the European Power Exchange (EPEX) which incorporates specific features like renewable energy. The model consists of several sophisticated and established approaches and can be regarded as a periodic VAR-TARCH with wind power, solar power, and load as influences on the time series. It is able to map the distinct and well-known features of electricity prices in Germany. An efficient iteratively reweighted lasso approach is used for the estimation. Moreover, it is shown that several existing models are outperformed by the procedure developed in this paper.
Sequential Monte Carlo for Graphical Models
Naesseth, Christian A., Lindsten, Fredrik, Schön, Thomas B.
We propose a new framework for using sequential Monte Carlo (SMC) algorithms for inference in probabilistic graphical models (PGMs). Via a sequential decomposition of the PGM, we find a sequence of auxiliary distributions defined on a monotonically increasing sequence of probability spaces. By targeting these auxiliary distributions using SMC, we are able to approximate the full joint distribution defined by the PGM. One of the key merits of the SMC sampler is that it provides an unbiased estimate of the partition function of the model. We also show how it can be used within a particle Markov chain Monte Carlo framework in order to construct high-dimensional block-sampling algorithms for general PGMs.
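The partition-function estimate can be illustrated on a toy model. The sketch below assumes a binary chain with made-up unary and pairwise potentials, a uniform proposal, and multinomial resampling at every step; none of these choices come from the paper. Variables are added one at a time, and the estimate is the product of the average incremental weights.

```python
import itertools
import math
import random

N_VARS = 5  # length of the hypothetical binary chain


def unary(x: int) -> float:
    return 1.5 if x == 1 else 1.0


def pair(a: int, b: int) -> float:
    return 2.0 if a == b else 1.0


def exact_partition() -> float:
    """Brute-force partition function, feasible only for tiny models."""
    total = 0.0
    for xs in itertools.product((0, 1), repeat=N_VARS):
        w = math.prod(unary(x) for x in xs)
        w *= math.prod(pair(a, b) for a, b in zip(xs, xs[1:]))
        total += w
    return total


def smc_partition(num_particles: int, rng: random.Random) -> float:
    """SMC estimate of the partition function via sequential decomposition."""
    # Only the most recent variable matters for a chain, so store just that.
    last = [None] * num_particles
    z_hat = 1.0
    for _ in range(N_VARS):
        proposals = [rng.randint(0, 1) for _ in range(num_particles)]
        weights = []
        for prev, x in zip(last, proposals):
            w = unary(x)
            if prev is not None:
                w *= pair(prev, x)
            weights.append(w / 0.5)  # divide by the uniform proposal density
        z_hat *= sum(weights) / num_particles
        last = rng.choices(proposals, weights=weights, k=num_particles)
    return z_hat


z_exact = exact_partition()
z_smc = smc_partition(20000, random.Random(0))
```

On a model this small the exact value is available by enumeration, which makes it easy to check that the SMC estimate concentrates around it as the number of particles grows.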
Entrenchment-Based Horn Contraction
The AGM framework is the benchmark approach in belief change. Since the framework assumes an underlying logic containing classical Propositional Logic, it cannot be applied to systems with a logic weaker than Propositional Logic. To remedy this limitation, several researchers have studied AGM-style contraction and revision under the Horn fragment of Propositional Logic (i.e., Horn logic). In this paper, we contribute to this line of research by investigating the Horn version of the AGM entrenchment-based contraction. The study is challenging as the construction of entrenchment-based contraction refers to arbitrary disjunctions which are not expressible under Horn logic. In order to adapt the construction to Horn logic, we make use of a Horn approximation technique called Horn strengthening. We provide a representation theorem for the newly constructed contraction which we refer to as entrenchment-based Horn contraction. Ideally, contractions defined under Horn logic (i.e., Horn contractions) should be as rational as AGM contraction. We propose the notion of Horn equivalence which intuitively captures the equivalence between Horn contraction and AGM contraction. We show that, under this notion, entrenchment-based Horn contraction is equivalent to a restricted form of entrenchment-based contraction.
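As a rough illustration of why arbitrary disjunctions are a problem and what a Horn strengthening looks like, the sketch below works at the level of a single clause. It assumes the common clause-level definition (keep every negative literal and exactly one positive literal); the paper's construction may differ in detail, and the integer encoding and function names are illustrative only.

```python
from typing import FrozenSet, List

Clause = FrozenSet[int]  # positive int = atom, negative int = negated atom


def is_horn(clause: Clause) -> bool:
    """A clause is Horn if it has at most one positive literal."""
    return sum(1 for lit in clause if lit > 0) <= 1


def horn_strengthenings(clause: Clause) -> List[Clause]:
    """Weakest Horn clauses that entail `clause` (assumed clause-level definition)."""
    if is_horn(clause):
        return [frozenset(clause)]
    negatives = frozenset(lit for lit in clause if lit < 0)
    positives = sorted(lit for lit in clause if lit > 0)
    return [negatives | {p} for p in positives]


# p v q v ~r is not Horn; its strengthenings are (p v ~r) and (q v ~r).
non_horn = frozenset({1, 2, -3})
strengthened = horn_strengthenings(non_horn)
```

Each strengthening is logically stronger than the original disjunction, so replacing a non-Horn disjunction by one of them stays within Horn logic at the cost of committing to one disjunct.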
On the Testability of BDI Agent Systems
Before deploying a software system we need to assure ourselves (and stakeholders) that the system will behave correctly. This assurance is usually done by testing the system. However, it is intuitively obvious that adaptive systems, including agent-based systems, can exhibit complex behaviour, and are thus harder to test. In this paper we examine this "obvious intuition" in the case of Belief-Desire-Intention (BDI) agents. We analyse the size of the behaviour space of BDI agents and show that although the intuition is correct, the factors that influence the size are not what we expected them to be. Specifically, we found that the introduction of failure handling had a much larger effect on the size of the behaviour space than we expected. We also discuss the implications of these findings on the testability of BDI agents.
Belief Tracking for Planning with Sensing: Width, Complexity and Approximations
We consider the problem of belief tracking in a planning setting where states are valuations over a set of variables that are partially observable, and beliefs stand for the sets of states that are possible. While the problem is intractable in the worst case, it has been recently shown that in deterministic conformant and contingent problems, belief tracking is exponential in a width parameter that is often bounded and small. In this work, we extend these results in two ways. First, we introduce a width notion that applies to non-deterministic problems as well, develop a factored belief tracking algorithm that is exponential in the problem width, and show how it applies to existing benchmarks. Second, we introduce a meaningful, powerful, and sound approximation scheme, beam tracking, that is exponential in a smaller parameter, the problem causal width, and has much broader applicability. We illustrate the value of this algorithm over large instances of problems such as Battleship, Minesweeper, and Wumpus, where it yields state-of-the-art performance in real-time.
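For orientation, the flat baseline that factored and beam tracking improve on can be sketched in a few lines: a belief is a set of states, an action maps it forward through the transition relation, and an observation filters it. The corridor domain, the sensor, and the function names below are hypothetical, and this is the naive set-based update, not the paper's factored algorithm.

```python
from typing import Callable, FrozenSet, Hashable, Iterable

State = Hashable


def progress(belief: FrozenSet[State], action: str,
             transition: Callable[[State, str], Iterable[State]]) -> FrozenSet[State]:
    """States reachable from some state in the belief by applying the action."""
    return frozenset(s2 for s in belief for s2 in transition(s, action))


def filter_obs(belief: FrozenSet[State], observation: object,
               sensor: Callable[[State], object]) -> FrozenSet[State]:
    """Keep only the states consistent with the observation."""
    return frozenset(s for s in belief if sensor(s) == observation)


# Hypothetical corridor with cells 0..3; the sensor reports whether the agent
# is at the right wall. Returning a set allows non-deterministic transitions.
def move(s: int, action: str) -> set:
    return {min(s + 1, 3)} if action == "right" else {s}


def at_wall(s: int) -> bool:
    return s == 3


b = frozenset({0, 1, 2, 3})                              # initial position unknown
b = filter_obs(progress(b, "right", move), False, at_wall)
after_first_step = b
b = filter_obs(progress(b, "right", move), True, at_wall)
final_belief = b
```

The cost of this flat update grows with the number of possible states, which is exponential in the number of variables; that is the blow-up the width-based factorization is meant to contain.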
MDD Propagation for Sequence Constraints
Bergman, D., Cire, A. A., van Hoeve, W.
We study propagation for the Sequence constraint in the context of constraint programming based on limited-width MDDs. Our first contribution is proving that establishing MDD-consistency for Sequence is NP-hard. Yet, we also show that this task is fixed-parameter tractable with respect to the length of the sub-sequences. In addition, we propose a partial filtering algorithm that relies on a specific decomposition of the constraint and a novel extension of MDD filtering to node domains. We experimentally evaluate the performance of our proposed filtering algorithm, and demonstrate that the strength of the MDD propagation increases as the maximum width is increased. In particular, MDD propagation can outperform conventional domain propagation for Sequence by reducing the search tree size and solving time by several orders of magnitude. Similar improvements are observed with respect to the current best MDD approach that applies the decomposition of Sequence into Among constraints.
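For readers unfamiliar with the constraint, the sketch below checks a complete assignment against Sequence under its standard definition: every window of q consecutive variables must take between l and u values from a set S. The check is written as the conjunction of per-window Among constraints mentioned above. It is a checker only, not a propagator, and the parameter names and sample data are assumptions.

```python
from typing import Hashable, Sequence, Set


def among(window: Sequence[Hashable], lo: int, hi: int, S: Set[Hashable]) -> bool:
    """Among: the number of values in the window that belong to S lies in [lo, hi]."""
    count = sum(1 for v in window if v in S)
    return lo <= count <= hi


def sequence_holds(values: Sequence[Hashable], q: int, lo: int, hi: int,
                   S: Set[Hashable]) -> bool:
    """Sequence as a conjunction of Among constraints over all windows of length q."""
    return all(among(values[i:i + q], lo, hi, S)
               for i in range(len(values) - q + 1))


# Hypothetical roster: 1 = night shift. Every 3 consecutive days must contain
# at least 1 and at most 2 night shifts.
ok_roster = sequence_holds([1, 0, 1, 1, 0, 1], q=3, lo=1, hi=2, S={1})
bad_roster = sequence_holds([1, 1, 1, 0, 0, 0], q=3, lo=1, hi=2, S={1})
```

Consecutive windows share q - 1 variables, which is why propagating each Among constraint separately is weaker than reasoning about the whole sequence at once.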