Mathematical & Statistical Methods
Solving Linear Algebra by Program Synthesis
We solve MIT's Linear Algebra 18.06 course and Columbia University's Computational Linear Algebra COMS3251 course with perfect accuracy by interactive program synthesis. This surprisingly strong result is achieved by turning the course questions into programming tasks and then running the programs to produce the correct answers. We use OpenAI Codex with zero-shot learning, without providing any examples in the prompts, to synthesize code from questions. We quantify the difference between the original question text and the transformed question text that yields a correct answer. Since none of the COMS3251 questions are available online, the model is not overfitting. We go beyond just generating code for questions with numerical answers by interactively generating code that also produces visually pleasing plots as output. Finally, we automatically generate new questions given a few sample questions, which may be used as new course content. This work is a significant step forward in solving quantitative math problems and opens the door for solving many university-level STEM courses by machine.
Stochastic Gradient Line Bayesian Optimization: Reducing Measurement Shots in Optimizing Parameterized Quantum Circuits
Tamiya, Shiro, Yamasaki, Hayata
Optimization of parameterized quantum circuits is indispensable for applications of near-term quantum devices to computational tasks with variational quantum algorithms (VQAs). However, the existing optimization algorithms for VQAs require an excessive number of quantum-measurement shots in estimating expectation values of observables or iterating updates of circuit parameters, whose cost has been a crucial obstacle for practical use. To address this problem, we develop an efficient framework, stochastic gradient line Bayesian optimization (SGLBO), for the circuit optimization with fewer measurement shots. The SGLBO reduces the cost of measurement shots by estimating an appropriate direction of updating the parameters based on stochastic gradient descent (SGD) and further by utilizing Bayesian optimization (BO) to estimate the optimal step size in each iteration of the SGD. We formulate an adaptive measurement-shot strategy to achieve the optimization feasibly without relying on precise expectation-value estimation and many iterations; moreover, we show that a technique of suffix averaging can significantly reduce the effect of statistical and hardware noise in the optimization for the VQAs. Our numerical simulation demonstrates that the SGLBO augmented with these techniques can drastically reduce the required number of measurement shots, improve the accuracy in the optimization, and enhance the robustness against the noise compared to other state-of-the-art optimizers in representative tasks for the VQAs. These results establish a framework of quantum-circuit optimizers integrating two different optimization approaches, SGD and BO, to reduce the cost of measurement shots significantly.
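The suffix-averaging idea mentioned in the abstract can be illustrated on a toy problem. The sketch below is not SGLBO and involves no quantum circuit or Bayesian optimization: it assumes a one-dimensional quadratic objective, a fixed step size, and Gaussian noise as a stand-in for shot noise. The function name `sgd_with_suffix_average` and all parameter values are hypothetical choices made for illustration.

```python
import random

def sgd_with_suffix_average(grad, x0, step, n_steps, suffix_fraction=0.5, seed=0):
    """Run SGD with a noisy 1-D gradient; return (last_iterate, suffix_average)."""
    rng = random.Random(seed)
    x = x0
    iterates = []
    for _ in range(n_steps):
        # Noisy gradient: the Gaussian term stands in for measurement noise (assumption).
        g = grad(x) + rng.gauss(0.0, 1.0)
        x -= step * g
        iterates.append(x)
    # Average only the final fraction of iterates, discarding the early transient.
    start = int(len(iterates) * (1.0 - suffix_fraction))
    suffix = iterates[start:]
    return x, sum(suffix) / len(suffix)

# Toy objective f(x) = (x - 3)^2 with gradient 2 (x - 3); the minimum is at x = 3.
last, averaged = sgd_with_suffix_average(
    lambda x: 2.0 * (x - 3.0), x0=0.0, step=0.1, n_steps=1000
)
print("last iterate:", last)
print("suffix average:", averaged)
```

With a fixed step size the last iterate keeps fluctuating around the minimum, while the average over the final half of the iterates smooths out much of that noise.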
Clustering of longitudinal data: A tutorial on a variety of approaches
Teuling, Niek Den, Pauws, Steffen, Heuvel, Edwin van den
During the past two decades, methods for identifying groups with different trends in longitudinal data have become of increasing interest across many areas of research. To support researchers, we summarize the guidance from the literature regarding longitudinal clustering. Moreover, we present a selection of methods for longitudinal clustering, including group-based trajectory modeling (GBTM), growth mixture modeling (GMM), and longitudinal k-means (KML). The methods are introduced at a basic level, and strengths, limitations, and model extensions are listed. Following the recent developments in data collection, attention is given to the applicability of these methods to intensive longitudinal data (ILD). We demonstrate the application of the methods on a synthetic dataset using packages available in R.
Robust Estimation for Random Graphs
Acharya, Jayadev, Jain, Ayush, Kamath, Gautam, Suresh, Ananda Theertha, Zhang, Huanyu
Finding underlying patterns and structure in data is a central task in machine learning and statistics. Typically, such structures are induced by modelling assumptions on the data generating procedure. While they offer mathematical convenience, real data generally does not match with these idealized models, for reasons ranging from model misspecification to adversarial data poisoning. Thus for learning algorithms to be effective in the wild, we require methods that are robust to deviations from the assumed model. With this motivation, we initiate the study of robust estimation for random graph models. Specifically, we will be concerned with the Erdős-Rényi (ER) random graph model [Gil59, ER59].
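For readers unfamiliar with the ER model, in its G(n, p) form each of the n(n-1)/2 possible edges on n nodes is included independently with probability p. The following minimal sketch samples such a graph and computes the plain empirical edge density. That density is the naive, non-robust estimate of p, shown only as a baseline; it is not the robust estimator studied in the paper. The function names are hypothetical.

```python
import random
from itertools import combinations

def sample_er_graph(n, p, seed=None):
    """Sample G(n, p): keep each unordered node pair independently with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]

def estimate_edge_probability(n, edges):
    """Naive (non-robust) estimate of p: observed edges over possible edges."""
    possible = n * (n - 1) // 2
    return len(edges) / possible

edges = sample_er_graph(200, 0.1, seed=42)
print("estimated p:", estimate_edge_probability(200, edges))
```

Because this estimate depends on every observed edge, an adversary who rewires the edges of even a few nodes can shift it, which is the kind of deviation robust estimation aims to withstand.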
A Comparison of Model-Free and Model Predictive Control for Price Responsive Water Heaters
Biagioni, David J., Zhang, Xiangyu, Graf, Peter, Sigler, Devon, Jones, Wesley
We present a careful comparison of two model-free control algorithms, Evolution Strategies (ES) and Proximal Policy Optimization (PPO), with receding horizon model predictive control (MPC) for operating simulated, price responsive water heaters. Four MPC variants are considered: a one-shot controller with perfect forecasting yielding optimal control; a limited-horizon controller with perfect forecasting; a mean forecasting-based controller; and a two-stage stochastic programming controller using historical scenarios. In all cases, the MPC models for water temperature and electricity price are exact; only water demand is uncertain. For comparison, both ES and PPO learn neural network-based policies by directly interacting with the simulated environment under the same scenarios used by MPC. All methods are then evaluated on a separate one-week continuation of the demand time series. We demonstrate that optimal control for this problem is challenging, requiring more than 8-hour lookahead for MPC with perfect forecasting to attain the minimum cost. Despite this challenge, both ES and PPO learn good general purpose policies that outperform mean forecast and two-stage stochastic MPC controllers in terms of average cost and are more than two orders of magnitude faster at computing actions. We show that ES in particular can leverage parallelism to learn a policy in under 90 seconds using 1150 CPU cores.
Modelling and Optimisation of Resource Usage in an IoT Enabled Smart Campus
University campuses are essentially a microcosm of a city. They comprise diverse facilities such as residences, sport centres, lecture theatres, parking spaces, and public transport stops. Universities are under constant pressure to improve efficiencies while offering a better experience to various stakeholders including students, staff, and visitors. Nonetheless, anecdotal evidence indicates that campus assets are not being utilised efficiently, often due to the lack of data collection and analysis, thereby limiting the ability to make informed decisions on the allocation and management of resources. Advances in Internet of Things (IoT) technologies that can sense and communicate data from the physical world, coupled with data analytics and artificial intelligence (AI) that can predict usage patterns, have opened up new opportunities for organisations to lower cost and improve user experience. This thesis explores this opportunity via theory and experimentation using UNSW Sydney as a living laboratory.
Linear Algebra Beginner - Expert, Plus Data Science Practice
In this course, we look at core Linear Algebra concepts and how they can be used to solve real-world problems. We shall go through core Linear Algebra topics like Matrices, Vectors and Vector Spaces. If you are interested in learning the mathematical concepts of linear algebra, but also want to apply those concepts to data science, statistics, finance, engineering, etc., then this course is for you! We shall explain all the maths concepts in detail and also implement them programmatically in Python. We place strong emphasis on feedback.
Geodesic statistics for random network families
A key task in the study of networked systems is to derive local and global properties that impact connectivity, synchronizability, and robustness. Computing shortest paths or geodesics in the network yields measures of node centrality and network connectivity that can help explain such phenomena. We derive an analytic distribution of shortest path lengths, on the giant component in the supercritical regime or on small components in the subcritical regime, of any sparse (possibly directed) graph with conditionally independent edges, in the infinite-size limit. We provide specific results for widely used network families like stochastic block models, dot-product graphs, random geometric graphs, and graphons. The survival function of the shortest path length distribution possesses a simple closed-form lower bound which is asymptotically tight for finite lengths, has a natural interpretation of traversing independent geodesics in the network, and delivers novel insight into the above network families. Notably, the shortest path length distribution allows us to derive, for the network families above, important graph properties like the bond percolation threshold, size of the giant component, average shortest path length, and closeness and betweenness centralities. We also provide a corroborative analysis of a set of 20 empirical networks. This unifying framework demonstrates how geodesic statistics for a rich family of random graphs can be computed cheaply without having access to true or simulated networks, especially when they are sparse but prohibitively large.
Linear Algebra To Know For Machine Learning
Linear algebra deals with linear equations and linear maps (mappings between two vector spaces that preserve the operations of vector addition and scalar multiplication) and their representations in vector spaces and through matrices. Linear algebra is key in almost all areas of mathematics and is widely used in science and many fields of engineering, as it helps model different natural phenomena and compute them efficiently. Linear algebra is highly similar to the algebra we talked about in our previous article, except that instead of ordinary single numbers, it deals with vectors. Many of the same algebraic operations we are used to performing on ordinary numbers (i.e., scalars), like addition, subtraction and multiplication, can be generalized to operate on vectors.
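As a small illustration of these ideas, the sketch below implements vector addition and scalar multiplication on plain Python lists, then checks that a matrix-vector product behaves as a linear map, meaning it preserves both operations. The helper names `add`, `scale`, and `apply` and the example matrix are hypothetical, chosen only for this example.

```python
def add(u, v):
    """Component-wise vector addition."""
    return [a + b for a, b in zip(u, v)]

def scale(c, v):
    """Multiply every component of v by the scalar c."""
    return [c * a for a in v]

def apply(matrix, v):
    """Matrix-vector product: one dot product per matrix row."""
    return [sum(m * a for m, a in zip(row, v)) for row in matrix]

A = [[2, 0],
     [1, 3]]
u, v = [1, 2], [3, 4]

# Additivity: A(u + v) equals A(u) + A(v).
print(apply(A, add(u, v)), add(apply(A, u), apply(A, v)))
# Homogeneity: A(5u) equals 5 A(u).
print(apply(A, scale(5, u)), scale(5, apply(A, u)))
```

Both printed pairs agree, which is what it means for the map defined by `A` to preserve vector addition and scalar multiplication.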
Vector Calculus for Machine Learning
To keep this post as engaging and entertaining as possible I will first introduce a brief history of Calculus and why I think it is so cool. Then, we will move on to reviewing fundamental concepts of your high school calculus such as derivative rules. Next, we will get our feet wet with vectors and matrices to make sure you are comfortable with these mathematical objects before covering partial and vector derivatives. Finally, I will conclude this post with the concept of a gradient, the intuition behind optimization with Gradient Descent and a cool implementation of calculus with Python leveraging the library SymPy. Feel free to skip any sections you like if you are comfortable with such topics. At the core, Calculus is just a very special way of thinking about large problems by splitting them into several smaller problems.
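The intuition behind gradient descent can be sketched in a few lines: repeatedly step against the gradient until the point settles near a minimum. The example below is a minimal sketch that uses only plain Python and a hand-derived gradient rather than a symbolic library. The objective f(x, y) = (x - 1)^2 + (y + 2)^2, the learning rate, and the function names are assumptions chosen for illustration.

```python
def gradient_descent(grad, start, learning_rate=0.1, n_steps=100):
    """Repeatedly move a point a small step against the gradient."""
    point = list(start)
    for _ in range(n_steps):
        g = grad(point)
        point = [x - learning_rate * gx for x, gx in zip(point, g)]
    return point

def grad_f(p):
    """Gradient of f(x, y) = (x - 1)^2 + (y + 2)^2, derived by hand."""
    x, y = p
    return [2.0 * (x - 1.0), 2.0 * (y + 2.0)]

minimum = gradient_descent(grad_f, start=[0.0, 0.0])
print(minimum)  # close to [1.0, -2.0], the minimizer of f
```

Each partial derivative here follows from the power rule, and the vector of partial derivatives is the gradient that the update rule steps against.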