AITopics

2106.07644

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Asia > Middle East > Jordan (0.04)
Europe > Switzerland > Vaud > Lausanne (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.85)

Even, Mathieu, Hendrikx, Hadrien, Massoulie, Laurent

Decentralized Optimization with Heterogeneous Delays: a Continuous-Time Approach

arXiv.org Machine LearningJun-7-2021

In decentralized optimization, nodes of a communication network privately possess a local objective function, and communicate using gossip-based methods in order to minimize the average of these per-node objectives. While synchronous algorithms can be heavily slowed down by a few nodes and edges in the graph (the straggler problem), their asynchronous counterparts lack from a sharp analysis taking into account heterogeneous delays in the communication network. In this paper, we propose a novel continuous-time framework to analyze asynchronous algorithms, which does not require to define a global ordering of the events, and allows to finely characterize the time complexity in the presence of (heterogeneous) delays. Using this framework, we describe a fully asynchronous decentralized algorithm to minimize the sum of smooth and strongly convex functions. Our algorithm (DCDM, Delayed Coordinate Dual Method), based on delayed randomized gossip communications and local computational updates, achieves an asynchronous speed-up: the rate of convergence is tightly characterized in terms of the eigengap of the graph weighted by local delays only, instead of the global worst-case delays as in previous analyses.

algorithm, comp, graph, (16 more...)

2106.03585

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Communications > Networks (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.46)

Tjøstheim, Dag, Jullum, Martin, Løland, Anders

Statistical embedding: Beyond principal components

arXiv.org Machine LearningJun-3-2021

There has been an intense recent activity in embedding of very high dimensional and nonlinear data structures, much of it in the data science and machine learning literature. We survey this activity in four parts. In the first part we cover nonlinear methods such as principal curves, multidimensional scaling, local linear methods, ISOMAP, graph based methods and kernel based methods. The second part is concerned with topological embedding methods, in particular mapping topological properties into persistence diagrams. Another type of data sets with a tremendous growth is very high-dimensional network data. The task considered in part three is how to embed such data in a vector space of moderate dimension to make the data amenable to traditional techniques such as cluster and classification techniques. The final part of the survey deals with embedding in $\mathbb{R}^2$, which is visualization. Three methods are presented: $t$-SNE, UMAP and LargeVis based on methods in parts one, two and three, respectively. The methods are illustrated and compared on two simulated data sets; one consisting of a triple of noisy Ranunculoid curves, and one consisting of networks of increasing complexity and with two types of nodes.

artificial intelligence, machine learning, statistical embedding, (17 more...)

2106.01858

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York (0.04)
North America > United States > Nevada (0.04)
(7 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Education (0.87)
Information Technology > Networks (0.34)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(5 more...)

Kümmerle, Christian, Verdun, Claudio Mayrink

A Scalable Second Order Method for Ill-Conditioned Matrix Completion from Few Samples

arXiv.org Machine LearningJun-3-2021

We propose an iterative algorithm for low-rank matrix completion that can be interpreted as an iteratively reweighted least squares (IRLS) algorithm, a saddle-escaping smoothing Newton method or a variable metric proximal gradient method applied to a non-convex rank surrogate. It combines the favorable data-efficiency of previous IRLS approaches with an improved scalability by several orders of magnitude. We establish the first local convergence guarantee from a minimal number of samples for that class of algorithms, showing that the method attains a local quadratic convergence rate. Furthermore, we show that the linear systems to be solved are well-conditioned even for very ill-conditioned ground truth matrices. We provide extensive experiments, indicating that unlike many state-of-the-art approaches, our method is able to complete very ill-conditioned matrices with a condition number of up to $10^{10}$ from few samples, while being competitive in its scalability.

algorithm, matrix, matrixirls, (13 more...)

2106.02119

Country:

North America > United States > New York > New York County > New York City (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(3 more...)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Software > Programming Languages (0.88)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.48)

#artificialintelligenceJun-2-2021, 11:57:47 GMT

Essential Linear Algebra for Data Science and Machine Learning - KDnuggets

Linear Algebra is a branch of mathematics that is extremely useful in data science and machine learning. Linear algebra is the most important math skill in machine learning. Most machine learning models can be expressed in matrix form. A dataset itself is often represented as a matrix. Linear algebra is used in data preprocessing, data transformation, and model evaluation.

covariance matrix, data science, matrix, (8 more...)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.31)

#artificialintelligenceMay-31-2021, 22:50:19 GMT

LASSO regression using tidymodels and #TidyTuesday data for The Office

Our modeling goal here is to predict the IMDB ratings for episodes of The Office based on the other characteristics of the episodes in the #TidyTuesday dataset. There are two datasets, one with the ratings and one with information like director, writer, and which character spoke which line. The episode numbers and titles are not consistent between them, so we can use regular expressions to do a better job of matching the datasets up for joining. We are going to use several different kinds of features for modeling. First, let's find out how many times characters speak per episode.

lasso regression, modeling, tidytuesday data, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.44)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.44)

#artificialintelligenceMay-30-2021, 17:18:02 GMT

Math for Data Science and Machine Learning: University Level

In this course, we will learn math for data science and machine learning. We will also discuss the importance of Math for data science and machine learning in practical words. Moreover, Math for data science and machine learning course is a bundle of two courses of linear algebra and probability and statistics. So, students will learn the complete contents of probability and statistics, and linear algebra. It is not like that you will not complete all the contents in this 7 hours video course.

instructor, probability and statistic, student, (8 more...)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry: Education > Curriculum > Subject-Specific Education (0.36)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

#artificialintelligenceMay-30-2021, 14:16:23 GMT

Graph Theory Algorithms

My name is William, and I'm super excited to bring to you this a video series focused on graph theory. Graph theory is one of my absolute favorite topics in computer science. We're going to see a lot of very awesome algorithms. The whole field is very diverse and hugely applicable to real world applications. I think everybody should be able to learn, love and enjoy graph theory.

graph, graph theory, node, (13 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (1.00)

Haben, Stephen, Arora, Siddharth, Giasemidis, Georgios, Voss, Marcus, Greetham, Danica Vukadinovic

Review of Low-Voltage Load Forecasting: Methods, Applications, and Recommendations

arXiv.org Machine LearningMay-30-2021

The increased digitalisation and monitoring of the energy system opens up numerous opportunities % and solutions which can help to decarbonise the energy system. Applications on low voltage (LV), localised networks, such as community energy markets and smart storage will facilitate decarbonisation, but they will require advanced control and management. Reliable forecasting will be a necessary component of many of these systems to anticipate key features and uncertainties. Despite this urgent need, there has not yet been an extensive investigation into the current state-of-the-art of low voltage level forecasts, other than at the smart meter level. This paper aims to provide a comprehensive overview of the landscape, current approaches, core applications, challenges and recommendations. Another aim of this paper is to facilitate the continued improvement and advancement in this area. To this end, the paper also surveys some of the most relevant and promising trends. It establishes an open, community-driven list of the known LV level open datasets to encourage further research and development.

forecast, renewable energy, upstream oil & gas, (28 more...)

2106.00006

Country:

North America > United States (1.00)
Asia > China (0.67)
Europe > Germany (0.46)
(14 more...)

Genre:

Overview (1.00)
Research Report > Experimental Study (0.92)

Industry:

Energy > Renewable > Solar (1.00)
Energy > Power Industry (1.00)
Energy > Energy Storage (0.93)
(2 more...)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Quality (1.00)
Information Technology > Data Science > Data Mining (1.00)
(15 more...)

arXiv.org Machine LearningMay-29-2021

Understanding Bandits with Graph Feedback

Chen, Houshuang, Huang, Zengfeng, Li, Shuai, Zhang, Chihao

The bandit problem with graph feedback, proposed in [Mannor and Shamir, NeurIPS 2011], is modeled by a directed graph $G=(V,E)$ where $V$ is the collection of bandit arms, and once an arm is triggered, all its incident arms are observed. A fundamental question is how the structure of the graph affects the min-max regret. We propose the notions of the fractional weak domination number $\delta^*$ and the $k$-packing independence number capturing upper bound and lower bound for the regret respectively. We show that the two notions are inherently connected via aligning them with the linear program of the weakly dominating set and its dual -- the fractional vertex packing set respectively. Based on this connection, we utilize the strong duality theorem to prove a general regret upper bound $O\left(\left( \delta^*\log |V|\right)^{\frac{1}{3}}T^{\frac{2}{3}}\right)$ and a lower bound $\Omega\left(\left(\delta^*/\alpha\right)^{\frac{1}{3}}T^{\frac{2}{3}}\right)$ where $\alpha$ is the integrality gap of the dual linear program. Therefore, our bounds are tight up to a $\left(\log |V|\right)^{\frac{1}{3}}$ factor on graphs with bounded integrality gap for the vertex packing problem including trees and graphs with bounded degree. Moreover, we show that for several special families of graphs, we can get rid of the $\left(\log |V|\right)^{\frac{1}{3}}$ factor and establish optimal regret.

algorithm, graph, log null, (17 more...)

2105.1426

Country:

Asia > China > Shanghai > Shanghai (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.87)
Information Technology > Data Science > Data Mining > Big Data (0.67)