AITopics | Mathematical & Statistical Methods

Collaborating Authors

Mathematical & Statistical Methods

News Overviews Instructional Materials AI-Alerts Classics

Linear-Time Algorithm for Learning Large-Scale Sparse Graphical Models

Zhang, Richard Y., Fattahi, Salar, Sojoudi, Somayeh

arXiv.org Machine LearningFeb-20-2018

The sparse inverse covariance estimation problem is commonly solved using an $\ell_{1}$-regularized Gaussian maximum likelihood estimator known as "graphical lasso", but its computational cost becomes prohibitive for large data sets. A recent line of results showed--under mild assumptions--that the graphical lasso estimator can be retrieved by soft-thresholding the sample covariance matrix and solving a maximum determinant matrix completion (MDMC) problem. This paper proves an extension of this result, and describes a Newton-CG algorithm to efficiently solve the MDMC problem. Assuming that the thresholded sample covariance matrix is sparse with a sparse Cholesky factorization, we prove that the algorithm converges to an $\epsilon$-accurate solution in $O(n\log(1/\epsilon))$ time and $O(n)$ memory. The algorithm is highly efficient in practice: we solve the associated MDMC problems with as many as 200,000 variables to 7-9 digits of accuracy in less than an hour on a standard laptop computer running MATLAB.

artificial intelligence, bayesian inference, machine learning, (17 more...)

arXiv.org Machine Learning

1802.04911

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Add feedback

Journal of Biometrics and Biostatistics - Open Access Journals

#artificialintelligenceFeb-19-2018, 12:34:15 GMT

Biometrics and Biostatistics are disciplines of biological sciences concerned with the application of mathematical-statistical theory, principles, and practices to the observation, measurement, and analysis of biological data and phenomena. Journal of Biometrics and Biostatistics is a leading peer reviewed journal, promoting open access publishing in the collection of major scientific journals available in the scientific society. This promotes the application of statistical methods to the solution of biological problems. Journal of Biometrics and Biostatistics is a academic journal and aims to publish most complete and reliable source of information on the discoveries and current developments in the mode of original articles, review articles, case reports, short communications, etc. in all areas related to Biometrics, Medical statistics and making them freely available through online without any restrictions or any other subscriptions to researchers worldwide. It is an online manuscript submission, review and managing systems.

artificial intelligence, biostatistic, related journal, (12 more...)

#artificialintelligence

Genre: Overview (0.55)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Information Technology > Security & Privacy (0.72)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)

Add feedback

Tools for higher-order network analysis

Benson, Austin R.

arXiv.org Machine LearningFeb-19-2018

Networks are a fundamental model of complex systems throughout the sciences, and network datasets are typically analyzed through lower-order connectivity patterns described at the level of individual nodes and edges. However, higher-order connectivity patterns captured by small subgraphs, also called network motifs, describe the fundamental structures that control and mediate the behavior of many complex systems. We develop three tools for network analysis that use higher-order connectivity patterns to gain new insights into network datasets: (1) a framework to cluster nodes into modules based on joint participation in network motifs; (2) a generalization of the clustering coefficient measurement to investigate higher-order closure patterns; and (3) a definition of network motifs for temporal networks and fast algorithms for counting them. Using these tools, we analyze data from biology, ecology, economics, neuroscience, online social networks, scientific collaborations, telecommunications, transportation, and the World Wide Web.

data mining, machine learning, neural information processing system, (19 more...)

arXiv.org Machine Learning

1802.0682

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.14)
Africa > Senegal > Kolda Region > Kolda (0.04)
Europe > Middle East > Cyprus > Nicosia > Nicosia (0.04)
(12 more...)

Genre:

Research Report > New Finding (1.00)
Instructional Material > Course Syllabus & Notes (0.92)
Research Report > Experimental Study (0.67)

Industry:

Transportation (1.00)
Telecommunications (1.00)
Information Technology > Services (1.00)
(3 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Communications > Networks (1.00)
(2 more...)

Add feedback

Global Convergence of Langevin Dynamics Based Algorithms for Nonconvex Optimization

Xu, Pan, Chen, Jinghui, Zou, Difan, Gu, Quanquan

arXiv.org Machine LearningFeb-19-2018

We present a unified framework to analyze the global convergence of Langevin dynamics based algorithms for nonconvex finite-sum optimization with $n$ component functions. At the core of our analysis is a direct analysis of the ergodicity of the numerical approximations to Langevin dynamics, which leads to faster convergence rates. Specifically, we show that gradient Langevin dynamics (GLD) and stochastic gradient Langevin dynamics (SGLD) converge to the almost minimizer within $\tilde O\big(nd/(\lambda\epsilon) \big)$ and $\tilde O\big(d^7/(\lambda^5\epsilon^5) \big)$ stochastic gradient evaluations respectively, where $d$ is the problem dimension, and $\lambda$ is the spectral gap of the Markov chain generated by GLD. Both of the results improve upon the best known gradient complexity results. Furthermore, for the first time we prove the global convergence guarantee for variance reduced stochastic gradient Langevin dynamics (VR-SGLD) to the almost minimizer after $\tilde O\big(\sqrt{n}d^5/(\lambda^4\epsilon^{5/2})\big)$ stochastic gradient evaluations, which outperforms the gradient complexities of GLD and SGLD in a wide regime. Our theoretical analyses shed some light on using Langevin dynamics based algorithms for nonconvex optimization with provable guarantees.

artificial intelligence, langevin dynamic, machine learning, (16 more...)

arXiv.org Machine Learning

1707.06618

Country:

North America > United States > Virginia > Albemarle County > Charlottesville (0.14)
Asia > Middle East > Jordan (0.04)
Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
Asia > Afghanistan > Parwan Province > Charikar (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.35)

Add feedback

Optimizing Spectral Sums using Randomized Chebyshev Expansions

Han, Insu, Avron, Haim, Shin, Jinwoo

arXiv.org Machine LearningFeb-18-2018

The trace of matrix functions, often called spectral sums, e.g., rank, log-determinant and nuclear norm, appear in many machine learning tasks. However, optimizing or computing such (parameterized) spectral sums typically involves the matrix decomposition at the cost cubic in the matrix dimension, which is expensive for large-scale applications. Several recent works were proposed to approximate large-scale spectral sums utilizing polynomial function approximations and stochastic trace estimators. However, all prior works on this line have studied biased estimators, and their direct adaptions to an optimization task under stochastic gradient descent (SGD) frameworks often do not work as accumulated biased errors prevent stable convergence to the optimum. To address the issue, we propose the provable optimal unbiased estimator by randomizing Chebyshev polynomial degrees. We further introduce two additional techniques for accelerating SGD, where key ideas are on sharing randomness among many estimations during the iterative procedure. Finally, we showcase two applications of the proposed SGD schemes: matrix completion and learning Gaussian process, under the real-world datasets.

artificial intelligence, machine learning, spectral sum, (18 more...)

arXiv.org Machine Learning

1802.06355

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Asia > South Korea > Daejeon > Daejeon (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.88)

Add feedback

Newton-Type Methods for Non-Convex Optimization Under Inexact Hessian Information

Xu, Peng, Roosta-Khorasani, Farbod, Mahoney, Michael W.

arXiv.org Machine LearningFeb-15-2018

We consider variants of trust-region and cubic regularization methods for non-convex optimization, in which the Hessian matrix is approximated. Under mild conditions on the inexact Hessian, and using approximate solution of the corresponding sub-problems, we provide iteration complexity to achieve $ \epsilon $-approximate second-order optimality which have shown to be tight. Our Hessian approximation conditions constitute a major relaxation over the existing ones in the literature. Consequently, we are able to show that such mild conditions allow for the construction of the approximate Hessian through various random sampling methods. In this light, we consider the canonical problem of finite-sum minimization, provide appropriate uniform and non-uniform sub-sampling strategies to construct such Hessian approximations, and obtain optimal iteration complexity for the corresponding sub-sampled trust-region and cubic regularization methods.

arxiv preprint arxiv, data mining, machine learning, (19 more...)

arXiv.org Machine Learning

1708.07164

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Oceania > Australia > Queensland > Brisbane (0.04)
North America > United States > Pennsylvania > Northampton County > Bethlehem (0.04)
(3 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Mathematics of Computing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

Add feedback

Quantum machine learning: a classical perspective

Ciliberto, Carlo, Herbster, Mark, Ialongo, Alessandro Davide, Pontil, Massimiliano, Rocchetto, Andrea, Severini, Simone, Wossnig, Leonard

arXiv.org Machine LearningFeb-13-2018

Recently, increased computational power and data availability, as well as algorithmic advances, have led machine learning techniques to impressive results in regression, classification, data-generation and reinforcement learning tasks. Despite these successes, the proximity to the physical limits of chip fabrication alongside the increasing size of datasets are motivating a growing number of researchers to explore the possibility of harnessing the power of quantum computation to speed-up classical machine learning algorithms. Here we review the literature in quantum machine learning and discuss perspectives for a mixed readership of classical machine learning and quantum computation experts. Particular emphasis will be placed on clarifying the limitations of quantum algorithms, how they compare with their best classical counterparts and why quantum resources are expected to provide advantages for learning problems. Learning in the presence of noise and certain computationally hard problems in machine learning are identified as promising directions for the field. Practical questions, like how to upload classical data into quantum form, will also be addressed.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Machine Learning

doi: 10.1098/rspa.2017.0551

1707.08561

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Canada > Ontario > Toronto (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
(10 more...)

Genre:

Research Report (1.00)
Overview (0.88)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
(3 more...)

Add feedback

A Simple Proximal Stochastic Gradient Method for Nonsmooth Nonconvex Optimization

Li, Zhize, Li, Jian

arXiv.org Machine LearningFeb-13-2018

We analyze stochastic gradient algorithms for optimizing nonconvex, nonsmooth finite-sum problems. In particular, the objective function is given by the summation of a differentiable (possibly nonconvex) component, together with a possibly non-differentiable but convex component. We propose a proximal stochastic gradient algorithm based on variance reduction, called ProxSVRG+. The algorithm is a slight variant of the ProxSVRG algorithm [Reddi et al., 2016b]. Our main contribution lies in the analysis of ProxSVRG+. It recovers several existing convergence results (in terms of the number of stochastic gradient oracle calls and proximal operations), and improves/generalizes some others. In particular, ProxSVRG+ generalizes the best results given by the SCSG algorithm, recently proposed by [Lei et al., 2017] for the smooth nonconvex case. ProxSVRG+ is more straightforward than SCSG and yields simpler analysis. Moreover, ProxSVRG+ outperforms the deterministic proximal gradient descent (ProxGD) for a wide range of minibatch sizes, which partially solves an open problem proposed in [Reddi et al., 2016b]. Finally, for nonconvex functions satisfied Polyak-{\L}ojasiewicz condition, we show that ProxSVRG+ achieves global linear convergence rate without restart. ProxSVRG+ is always no worse than ProxGD and ProxSVRG/SAGA, and sometimes outperforms them (and generalizes the results of SCSG) in this case.

artificial intelligence, machine learning, proxsvrg, (18 more...)

arXiv.org Machine Learning

1802.04477

Country: Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

Add feedback

Introduction to Matrix Types in Linear Algebra for Machine Learning - Machine Learning Mastery

@machinelearnbotFeb-9-2018, 08:31:48 GMT

To be symmetric, the axis of symmetry is always the main diagonal of the matrix, from the top left to the bottom right. Below is an example of a 5 5 symmetric matrix. A symmetric matrix is always square and equal to its own transpose. A triangular matrix is a type of square matrix that has all values in the upper-right or lower-left of the matrix with the remaining elements filled with zero values. A triangular matrix with values only above the main diagonal is called an upper triangular matrix.

artificial intelligence, machine learning, matrix, (13 more...)

@machinelearnbot

Genre: Instructional Material > Course Syllabus & Notes (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.47)

Add feedback

A Sinkhorn-Newton method for entropic optimal transport

Brauer, Christoph, Clason, Christian, Lorenz, Dirk, Wirth, Benedikt

arXiv.org Machine LearningFeb-9-2018

The mathematical problem of optimal mass transport has a long history dating back to its introduction in Monge [10], with key contributions by Kantorovivc [6] and Kantorovivc & Rubinvsteuin [7]. It has recently received increased interest due to numerous applications in machine learning; see, e.g., the recent overview of Kolouri, Park, Thorpe, Slepcev & Rohde [9] and the references therein. In a nutshell, the (discrete) problem of optimal transport in its Kantorovich form is to compute for given mass distributions a and b with equal mass a transport plan, i.e., an assignment of how much mass of a at some point should be moved to another point to match the mass in b. This should be done in a way such that some transport cost (usually proportional to the amount of mass and dependent on the distance) is minimized. This leads to a linear optimization problem which has been well studied, but its application in machine learning has been problematic due to large memory requirement and long run time.

artificial intelligence, iteration, machine learning, (17 more...)

arXiv.org Machine Learning

1710.06635

Country:

Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
Europe > Germany > North Rhine-Westphalia > Münster Region > Münster (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.69)
Information Technology > Hardware > Memory (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.52)

Add feedback