Goto

Collaborating Authors

 Optimization


Training Electric Vehicle Charging Controllers with Imitation Learning

arXiv.org Artificial Intelligence

The problem of coordinating the charging of electric vehicles gains more importance as the number of such vehicles grows. In this paper, we develop a method for the training of controllers for the coordination of EV charging. In contrast to most existing works on this topic, we require the controllers to preserve the privacy of the users, therefore we do not allow any communication from the controller to any third party. In order to train the controllers, we use the idea of imitation learning -- we first find an optimum solution for a relaxed version of the problem using quadratic optimization and then train the controllers to imitate this solution. We also investigate the effects of regularization of the optimum solution on the performance of the controllers. The method is evaluated on realistic data and shows improved performance and training speed compared to similar controllers trained using evolutionary algorithms.


Optimal Operation of Power Systems with Energy Storage under Uncertainty: A Scenario-based Method with Strategic Sampling

arXiv.org Artificial Intelligence

The multi-period dynamics of energy storage (ES), intermittent renewable generation and uncontrollable power loads, make the optimization of power system operation (PSO) challenging. A multi-period optimal PSO under uncertainty is formulated using the chance-constrained optimization (CCO) modeling paradigm, where the constraints include the nonlinear energy storage and AC power flow models. Based on the emerging scenario optimization method which does not rely on pre-known probability distribution functions, this paper develops a novel solution method for this challenging CCO problem. The proposed meth-od is computationally effective for mainly two reasons. First, the original AC power flow constraints are approximated by a set of learning-assisted quadratic convex inequalities based on a generalized least absolute shrinkage and selection operator. Second, considering the physical patterns of data and motived by learning-based sampling, the strategic sampling method is developed to significantly reduce the required number of scenarios through different sampling strategies. The simulation results on IEEE standard systems indicate that 1) the proposed strategic sampling significantly improves the computational efficiency of the scenario-based approach for solving the chance-constrained optimal PSO problem, 2) the data-driven convex approximation of power flow can be promising alternatives of nonlinear and nonconvex AC power flow.


Positively Weighted Kernel Quadrature via Subsampling

arXiv.org Machine Learning

We study kernel quadrature rules with positive weights for probability measures on general domains. Our theoretical analysis combines the spectral properties of the kernel with random sampling of points. This results in effective algorithms to construct kernel quadrature rules with positive weights and small worst-case error. Besides additional robustness, our numerical experiments indicate that this can achieve fast convergence rates that compete with the optimal bounds in well-known examples.


Learning MR-Sort Models from Non-Monotone Data

arXiv.org Artificial Intelligence

The Majority Rule Sorting (MR-Sort) method assigns alternatives evaluated on multiple criteria to one of the predefined ordered categories. The Inverse MR-Sort problem (Inv-MR-Sort) computes MR-Sort parameters that match a dataset. Existing learning algorithms for Inv-MR-Sort consider monotone preferences on criteria. We extend this problem to the case where the preferences on criteria are not necessarily monotone, but possibly single-peaked (or single-valley). We propose a mixed-integer programming based algorithm that learns the preferences on criteria together with the other MR-Sort parameters from the training data. We investigate the performance of the algorithm using numerical experiments and we illustrate its use on a real-world case study.


Accelerating Edge Intelligence via Integrated Sensing and Communication

arXiv.org Artificial Intelligence

Realizing edge intelligence consists of sensing, communication, training, and inference stages. Conventionally, the sensing and communication stages are executed sequentially, which results in excessive amount of dataset generation and uploading time. This paper proposes to accelerate edge intelligence via integrated sensing and communication (ISAC). As such, the sensing and communication stages are merged so as to make the best use of the wireless signals for the dual purpose of dataset generation and uploading. However, ISAC also introduces additional interference between sensing and communication functionalities. To address this challenge, this paper proposes a classification error minimization formulation to design the ISAC beamforming and time allocation. Globally optimal solution is derived via the rank-1 guaranteed semidefinite relaxation, and performance analysis is performed to quantify the ISAC gain. Simulation results are provided to verify the effectiveness of the proposed ISAC scheme. Interestingly, it is found that when the sensing time dominates the communication time, ISAC is always beneficial. However, when the communication time dominates, the edge intelligence with ISAC scheme may not be better than that with the conventional scheme, since ISAC introduces harmful interference between the sensing and communication signals.


Wave-Informed Matrix Factorization withGlobal Optimality Guarantees

arXiv.org Machine Learning

With the recent success of representation learning methods, which includes deep learning as a special case, there has been considerable interest in developing representation learning techniques that can incorporate known physical constraints into the learned representation. As one example, in many applications that involve a signal propagating through physical media (e.g., optics, acoustics, fluid dynamics, etc), it is known that the dynamics of the signal must satisfy constraints imposed by the wave equation. Here we propose a matrix factorization technique that decomposes such signals into a sum of components, where each component is regularized to ensure that it satisfies wave equation constraints. Although our proposed formulation is non-convex, we prove that our model can be efficiently solved to global optimality in polynomial time. We demonstrate the benefits of our work by applications in structural health monitoring, where prior work has attempted to solve this problem using sparse dictionary learning approaches that do not come with any theoretical guarantees regarding convergence to global optimality and employ heuristics to capture desired physical constraints.


Latency-Memory Optimized Splitting of Convolution Neural Networks for Resource Constrained Edge Devices

arXiv.org Artificial Intelligence

With the increasing reliance of users on smart devices, bringing essential computation at the edge has become a crucial requirement for any type of business. Many such computations utilize Convolution Neural Networks (CNNs) to perform AI tasks, having high resource and computation requirements, that are infeasible for edge devices. Splitting the CNN architecture to perform part of the computation on edge and remaining on the cloud is an area of research that has seen increasing interest in the field. In this paper, we assert that running CNNs between an edge device and the cloud is synonymous to solving a resource-constrained optimization problem that minimizes the latency and maximizes resource utilization at the edge. We formulate a multi-objective optimization problem and propose the LMOS algorithm to achieve a Pareto efficient solution. Experiments done on real-world edge devices show that, LMOS ensures feasible execution of different CNN models at the edge and also improves upon existing state-of-the-art approaches.


Can we globally optimize cross-validation loss? Quasiconvexity in ridge regression

arXiv.org Machine Learning

Models like LASSO and ridge regression are extensively used in practice due to their interpretability, ease of use, and strong theoretical guarantees. Cross-validation (CV) is widely used for hyperparameter tuning in these models, but do practical optimization methods minimize the true out-of-sample loss? A recent line of research promises to show that the optimum of the CV loss matches the optimum of the out-of-sample loss (possibly after simple corrections). It remains to show how tractable it is to minimize the CV loss. In the present paper, we show that, in the case of ridge regression, the CV loss may fail to be quasiconvex and thus may have multiple local optima. We can guarantee that the CV loss is quasiconvex in at least one case: when the spectrum of the covariate matrix is nearly flat and the noise in the observed responses is not too high. More generally, we show that quasiconvexity status is independent of many properties of the observed data (response norm, covariate-matrix right singular vectors and singular-value scaling) and has a complex dependence on the few that remain. We empirically confirm our theory using simulated experiments.


Inference for Change Points in High Dimensional Mean Shift Models

arXiv.org Machine Learning

Detection of change points constitutes a canonical statistical problem due to numerous applications in diverse areas, including economics and finance (Basseville et al. [1993], Frisén [2008]), quality process control (Qiu [2013]), functional genomics and neuroscience (Koepcke et al. [2016]). The offline version of the problem, wherein one examines the data retrospectively and aims to detect the presence and/or location of change points has been studied extensively for a variety of statistical models, including signal plus noise, regression, graphical, random graph, factor and time series models and various algorithms have been developed to accomplish this task -dynamic programming, regularized cost functions, binary segmentation, multiscale methods, etc., see, e.g. the review article Niu et al. [2016]. In the presence of multiple change points, consistency of the estimated location of the change points under certain regularity assumptions on the temporal spacing between change points and on the magnitude of the changes in the underlying model parameters have been established, see, e.g. Fryzlewicz [2014], Frick et al. [2014] and Wang and Samworth [2018] amongst several others, here the former two are under a fixed p framework and the latter under a high dimensional framework. Further, when a single change point has been assumed, the asymptotic distribution of the change point estimator has been established for various statistical models, see, e.g., [Bai, 1994, 1997], Csorgo and Horváth [1997], under fixed p setting, and [Bhattacharjee et al., 2017, 2019], [Kaul et al., 2020, 2021], under diverging dimensionality, where the last two articles allow potential high dimensionality.


High-Dimensional Simulation Optimization via Brownian Fields and Sparse Grids

arXiv.org Machine Learning

High-dimensional simulation optimization is notoriously challenging. We propose a new sampling algorithm that converges to a global optimal solution and suffers minimally from the curse of dimensionality. The algorithm consists of two stages. First, we take samples following a sparse grid experimental design and approximate the response surface via kernel ridge regression with a Brownian field kernel. Second, we follow the expected improvement strategy -- with critical modifications that boost the algorithm's sample efficiency -- to iteratively sample from the next level of the sparse grid. Under mild conditions on the smoothness of the response surface and the simulation noise, we establish upper bounds on the convergence rate for both noise-free and noisy simulation samples. These upper bounds deteriorate only slightly in the dimension of the feasible set, and they can be improved if the objective function is known to be of a higher-order smoothness. Extensive numerical experiments demonstrate that the proposed algorithm dramatically outperforms typical alternatives in practice.