Regularity of Position Sequences
A person is given a numbered sequence of positions on a sheet of paper and asked, "Which position will come next (or the one after that)?" Everyone has an opinion about how to proceed. For regular sequences there is general agreement on how to continue, for less regular sequences this assessment is less certain, and for some sequences every continuation is perceived as arbitrary. I present a mathematical model that reflects these opinions and perceptions with the aid of a valuation function. To ensure the quality of this model, it is necessary to apply a rich set of invariant features of position sequences. All other properties of the model are arbitrary.
A Block Lanczos with Warm Start Technique for Accelerating Nuclear Norm Minimization Algorithms
Recent years have witnessed the popularity of using rank minimization as a regularizer for various signal processing and machine learning problems. As rank minimization problems are often converted to nuclear norm minimization (NNM) problems, they have to be solved iteratively and each iteration requires computing a singular value decomposition (SVD). Therefore, their solution suffers from the high computational cost of multiple SVDs. To alleviate this issue, we propose using the block Lanczos method to compute the partial SVDs, where the principal singular subspaces obtained in the previous iteration are used to start the block Lanczos procedure. To avoid the expensive reorthogonalization in the Lanczos procedure, the block Lanczos procedure is performed for only a few steps. Our block Lanczos with warm start (BLWS) technique can be adopted by different algorithms that solve NNM problems. We present numerical results on applying BLWS to Robust PCA and Matrix Completion problems. Experimental results show that our BLWS technique usually accelerates its host algorithm by a factor of two to three or more.
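The warm-start idea can be sketched in a few lines. The snippet below (Python, NumPy only) substitutes a few steps of warm-started block subspace iteration for the paper's block Lanczos procedure, purely to illustrate how the singular subspace from one NNM iteration seeds the next; the toy SVT-style completion loop, the problem sizes, the threshold and the step are our own illustrative choices, not the authors' code.

import numpy as np

def warm_start_partial_svd(M, V0, n_steps=3):
    """Approximate top-k singular triplets of M, starting the block iteration
    from V0 (the right subspace kept from the previous NNM iteration) rather
    than from a fresh random block."""
    V = np.linalg.qr(V0)[0]                  # orthonormalize the warm start
    for _ in range(n_steps):                 # only a few cheap steps
        U = np.linalg.qr(M @ V)[0]           # left subspace update
        V = np.linalg.qr(M.T @ U)[0]         # right subspace update
    B = U.T @ M @ V                          # small projected matrix
    Ub, s, Vbt = np.linalg.svd(B)            # Rayleigh-Ritz extraction
    return U @ Ub, s, V @ Vbt.T

# Toy SVT-style matrix completion loop; sizes, threshold and step size are
# illustrative, not the authors' setup.
rng = np.random.default_rng(0)
M_true = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 150))
mask = rng.random(M_true.shape) < 0.4        # observed entries
Y = mask * M_true                            # working variable
V = rng.standard_normal((150, 5))            # cold start only on the first pass
for it in range(50):
    U, s, V = warm_start_partial_svd(Y, V)   # warm-started partial SVD
    X = (U * np.maximum(s - 2.0, 0)) @ V.T   # singular value shrinkage
    Y += mask * (M_true - X)                 # step on the observed residual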
Mining Multi-Level Frequent Itemsets under Constraints
Gouider, Mohamed Salah, Farhat, Amine
Mining association rules is a data mining task that extracts knowledge in the form of significant implication relations among useful items (objects) in a database. Mining multi-level association rules uses concept hierarchies, also called taxonomies and defined as 'is-a' relations between objects, to extract rules whose items belong to different levels of abstraction. These rules are more useful, more refined and more interpretable by the user. Several algorithms have been proposed in the literature to discover multi-level association rules. In this article, we are interested in the problem of discovering multi-level frequent itemsets under constraints, involving the user in the search process. We propose a technique for modeling and interpreting constraints in a context where concept hierarchies are used. Three approaches for discovering multi-level frequent itemsets under constraints are proposed and discussed: a basic approach, a "test and generate" approach, and a pruning-based approach.
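The sketch below illustrates the "test and generate" flavour of such mining: each transaction is extended with the ancestors of its items along the 'is-a' taxonomy, and a user-supplied constraint prunes candidates as they are generated. The taxonomy, baskets and constraint are invented for illustration; this is not one of the paper's three algorithms verbatim.

from itertools import combinations

# Invented 'is-a' taxonomy and baskets, for illustration only.
taxonomy = {"skim milk": "milk", "2% milk": "milk", "milk": "food",
            "wheat bread": "bread", "bread": "food"}

def with_ancestors(items):
    """Extend a transaction with all ancestors of its items."""
    out = set(items)
    for it in items:
        while it in taxonomy:
            it = taxonomy[it]
            out.add(it)
    return frozenset(out)

def mine(transactions, min_support, constraint):
    """Level-wise mining; candidates failing the constraint are pruned."""
    data = [with_ancestors(t) for t in transactions]
    items = {i for t in data for i in t}
    frequent, level = [], [frozenset([i]) for i in sorted(items)]
    while level:
        counts = {c: sum(c <= t for t in data) for c in level}
        keep = [c for c, n in counts.items()
                if n >= min_support and constraint(c)]
        frequent += [(set(c), counts[c]) for c in keep]
        # join k-itemsets into (k+1)-itemset candidates
        level = list({a | b for a, b in combinations(keep, 2)
                      if len(a | b) == len(a) + 1})
    return frequent

baskets = [{"skim milk", "wheat bread"}, {"2% milk", "bread"}, {"skim milk"}]
# Example constraint: only itemsets mentioning some kind of milk.
print(mine(baskets, min_support=2,
           constraint=lambda c: any("milk" in i for i in c)))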
Ontology-based Queries over Cancer Data
Gonzalez-Beltran, Alejandra, Tagger, Ben, Finkelstein, Anthony
The ever-increasing amount of data in biomedical research, and in cancer research in particular, needs to be managed to support efficient data access, exchange and integration. Existing software infrastructures, such as caGrid, support access to distributed information annotated with a domain ontology. However, caGrid's current querying functionality depends on the structure of individual data resources without exploiting the semantic annotations. In this paper, we present the design and development of an ontology-based querying functionality that consists of: the generation of OWL2 ontologies from the metadata of the underlying data resources, and a query rewriting and translation process based on reasoning, which converts a query at the domain ontology level into queries at the software infrastructure level. We present a detailed analysis of our approach as well as an extensive performance evaluation. While the implementation and evaluation were performed for the caGrid infrastructure, the approach could be applicable to other model- and metadata-driven environments for data sharing.
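A minimal sketch of the two-stage idea follows, with an invented micro-ontology: the domain-level concept is first expanded over the subclass hierarchy (a stand-in for OWL2 reasoning) and each resulting concept is then translated to the resource-level table and condition it annotates. All class names, tables and query fragments are hypothetical and not part of caGrid.

# Hypothetical subclass hierarchy and resource-level annotations.
subclass_of = {"Adenocarcinoma": "Carcinoma", "Carcinoma": "MalignantNeoplasm"}
annotations = {"Carcinoma": ("tumor_table", "histology = 'carcinoma'"),
               "Adenocarcinoma": ("tumor_table", "histology = 'adenocarcinoma'")}

def _ancestors(c):
    """All concepts above c in the hierarchy."""
    seen = set()
    while c in subclass_of:
        c = subclass_of[c]
        seen.add(c)
    return seen

def subclasses(concept):
    """The queried concept plus everything below it ('reasoning' step)."""
    return {c for c in subclass_of if concept in _ancestors(c)} | {concept}

def rewrite(concept):
    """Domain-level concept -> resource-level query fragments ('translation')."""
    fragments = []
    for c in sorted(subclasses(concept)):
        if c in annotations:
            table, cond = annotations[c]
            fragments.append(f"SELECT * FROM {table} WHERE {cond}")
    return fragments

print(rewrite("Carcinoma"))  # queries for Carcinoma and its subclass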
Regularized Least-Mean-Square Algorithms
Chen, Yilun, Gu, Yuantao, Hero, Alfred O.
We consider adaptive system identification problems with convex constraints and propose a family of regularized Least-Mean-Square (LMS) algorithms. We show that, with a properly selected regularization parameter, the regularized LMS provably dominates its conventional counterpart in terms of mean-square deviation. We establish simple and closed-form expressions for choosing this regularization parameter. For identifying an unknown sparse system we propose sparse and group-sparse LMS algorithms, which are special examples of the regularized LMS family. Simulation results demonstrate the advantages of the proposed filters in both convergence rate and steady-state error under sparsity assumptions on the true coefficient vector.
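As a concrete illustration of this family, the sketch below implements a zero-attracting (sparse) LMS update: the ordinary LMS step plus an $\ell_1$ subgradient term that pulls small coefficients toward zero. The filter order, step size and regularization weight are illustrative assumptions, not values from the paper.

import numpy as np

def sparse_lms(x, d, order=16, mu=0.01, rho=5e-4):
    """Identify a sparse FIR system from input x and desired output d.
    mu and rho are illustrative choices."""
    w = np.zeros(order)
    for n in range(order - 1, len(x)):
        u = x[n - order + 1:n + 1][::-1]      # regressor [x[n], ..., x[n-order+1]]
        e = d[n] - w @ u                      # a-priori error
        w += mu * e * u - rho * np.sign(w)    # LMS step + zero attractor
    return w

rng = np.random.default_rng(1)
h = np.zeros(16); h[[2, 9]] = [1.0, -0.5]     # sparse "true" system
x = rng.standard_normal(5000)
d = np.convolve(x, h)[:len(x)] + 0.01 * rng.standard_normal(len(x))
print(np.round(sparse_lms(x, d), 2))          # spikes near indices 2 and 9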
Split Bregman Method for Sparse Inverse Covariance Estimation with Matrix Iteration Acceleration
Ye, Gui-Bo, Cai, Jian-Feng, Xie, Xiaohui
We consider the problem of estimating the inverse covariance matrix by maximizing the likelihood function with a penalty added to encourage sparsity of the resulting matrix. We propose a new approach based on the split Bregman method to solve the regularized maximum likelihood estimation problem. We show that our method is significantly faster than the widely used graphical lasso method, which is based on blockwise coordinate descent, on both artificial and real-world data. More importantly, unlike the graphical lasso, the split Bregman based method is much more general and can be applied to a class of regularization terms other than the $\ell_1$ norm.
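For orientation, a minimal split Bregman (ADMM-style) iteration for the $\ell_1$-penalized log-likelihood is sketched below: it alternates an eigendecomposition-based update of the precision matrix, entrywise soft-thresholding, and a Bregman update of the auxiliary variable. The matrix-iteration acceleration of the title is omitted, and all parameter values are illustrative assumptions.

import numpy as np

def split_bregman_glasso(S, lam, mu=1.0, n_iter=100):
    """Sketch of min_{Theta>0} -logdet(Theta) + tr(S Theta) + lam*||Theta||_1.
    mu and n_iter are illustrative choices."""
    p = S.shape[0]
    A = np.eye(p); B = np.zeros((p, p))
    for _ in range(n_iter):
        # Theta-update: solve mu*Theta - Theta^{-1} = mu*(A - B) - S
        d, Q = np.linalg.eigh(mu * (A - B) - S)
        theta = (d + np.sqrt(d ** 2 + 4 * mu)) / (2 * mu)
        Theta = (Q * theta) @ Q.T
        # A-update: entrywise soft-thresholding (the l1 proximal step)
        Z = Theta + B
        A = np.sign(Z) * np.maximum(np.abs(Z) - lam / mu, 0)
        # Bregman (dual) update
        B += Theta - A
    return A

rng = np.random.default_rng(2)
X = rng.standard_normal((500, 5))
S = np.cov(X, rowvar=False)
print(np.round(split_bregman_glasso(S, lam=0.1), 2))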
Automatic Estimation of the Exposure to Lateral Collision in Signalized Intersections using Video Sensors
Saunier, Nicolas, Midenet, Sophie
Intersections constitute one of the most dangerous elements in road systems. Traffic signals remain the most common way to control traffic at high-volume intersections and offer many opportunities to apply intelligent transportation systems to make traffic safer and more efficient. This paper describes an automated method to estimate the temporal exposure of road users crossing the conflict zone to lateral collision with road users arriving from a different approach. This component is part of a larger system that relies on video sensors to provide queue lengths and spatial occupancy measurements used for real-time traffic control and monitoring. The method is evaluated on data collected during a real-world experiment.
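One plausible way to operationalize such a temporal-exposure measure is sketched below: count the seconds during which the conflict zone is occupied by crossing users while users from a conflicting approach are simultaneously present. The boolean series stand in for the video-derived occupancy measurements; this is our reading of the task, not the paper's actual estimator.

import numpy as np

# Synthetic per-second occupancy indicators standing in for video measurements.
seconds = 60
rng = np.random.default_rng(4)
crossing_occupancy = rng.random(seconds) < 0.3   # conflict zone occupied by crossing users
conflicting_present = rng.random(seconds) < 0.2  # users present from a conflicting approach

exposure = np.logical_and(crossing_occupancy, conflicting_present)
print(f"exposure time: {exposure.sum()} s out of {seconds} s")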
A GMBCG Galaxy Cluster Catalog of 55,424 Rich Clusters from SDSS DR7
Hao, Jiangang, McKay, Timothy A., Koester, Benjamin P., Rykoff, Eli S., Rozo, Eduardo, Annis, James, Wechsler, Risa H., Evrard, August, Siegel, Seth R., Becker, Matthew, Busha, Michael, Gerdes, David, Johnston, David E., Sheldon, Erin
We present a large catalog of optically selected galaxy clusters obtained by applying a new Gaussian Mixture Brightest Cluster Galaxy (GMBCG) algorithm to SDSS Data Release 7 data. The algorithm detects clusters by identifying the red sequence plus Brightest Cluster Galaxy (BCG) feature, which is unique to galaxy clusters and does not exist among field galaxies. Red sequence clustering in color space is detected using an Error Corrected Gaussian Mixture Model. We run GMBCG on 8240 square degrees of photometric data from SDSS DR7 to assemble the largest optical galaxy cluster catalog to date, consisting of over 55,000 rich clusters over the redshift range 0.1 < z < 0.55. We present Monte Carlo tests of completeness and purity and perform cross-matching with X-ray clusters and with the maxBCG sample at low redshift. These tests indicate high completeness and purity across the full redshift range for clusters with 15 or more members.
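To illustrate the red-sequence clustering step, the toy sketch below fits a two-component Gaussian mixture to simulated galaxy colors so that the narrow red sequence separates from the broader blue population. A plain scikit-learn GaussianMixture is used as a stand-in for the paper's Error Corrected Gaussian Mixture Model, which additionally folds in per-galaxy color measurement errors; all data here are synthetic.

import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic g-r colors around a candidate BCG: a tight red sequence plus a
# broad blue/field population (stand-in data, not SDSS measurements).
rng = np.random.default_rng(3)
red = rng.normal(1.6, 0.05, 80)
blue = rng.normal(1.0, 0.25, 120)
colors = np.concatenate([red, blue]).reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=0).fit(colors)
red_comp = int(np.argmax(gmm.means_))           # redder mean = red sequence
members = gmm.predict(colors) == red_comp       # candidate cluster members
print(gmm.means_.ravel(), members.sum())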
Interpolation in Equilibrium Logic and Answer Set Programming: the Propositional Case
Gabbay, Dov, Pearce, David, Valverde, Agustín
Interpolation is an important property of classical and many non-classical logics that has been shown to have interesting applications in computer science and AI. Here we study the interpolation property for the propositional version of the non-monotonic system of equilibrium logic, establishing weaker or stronger forms of interpolation depending on the precise interpretation of the inference relation. These results also yield a form of interpolation for ground logic programs under the answer set semantics. For disjunctive logic programs we also study the property of uniform interpolation, which is closely related to the concept of variable forgetting.
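For orientation, the classical propositional form of the property under study can be stated as follows, with a small worked example; the concrete formulas are ours, chosen only for illustration, and the equilibrium-logic variants established in the paper weaken or strengthen this schema depending on the inference relation.

% Craig interpolation, classical propositional case (illustrative example).
\[
\text{If } \varphi \models \psi, \text{ then there is } \theta \text{ with }
\mathrm{var}(\theta) \subseteq \mathrm{var}(\varphi) \cap \mathrm{var}(\psi),
\quad \varphi \models \theta \text{ and } \theta \models \psi .
\]
\[
\text{Example: } \varphi = p \land q, \qquad \psi = p \lor r, \qquad
\theta = p .
\]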
Artificial Intelligence in Reverse Supply Chain Management: The State of the Art
Xing, Bo, Gao, Wen-Jing, Battle, Kimberly, Marwala, Tshilidzi, Nelwamondo, Fulufhelo V.
Product take-back legislation forces manufacturers to bear the costs of collecting and disposing of products that have reached the end of their useful lives. To reduce these costs, manufacturers can consider reuse, remanufacturing and/or recycling of components as an alternative to disposal. Implementing such alternatives usually requires appropriate reverse supply chain management. As the concept of the reverse supply chain gains popularity in practice, the use of artificial intelligence approaches in this area is also becoming popular. The purpose of this paper is therefore to give an overview of recent publications concerning the application of artificial intelligence techniques to the reverse supply chain, with an emphasis on certain types of product returns.