Approach to Finding a Robust Deep Learning Model

arXiv.org Artificial Intelligence

The rapid development of machine learning (ML) and artificial intelligence (AI) applications requires the training of large numbers of models. This growing demand highlights the importance of training models without human supervision, while ensuring that their predictions are reliable. In response to this need, we propose a novel approach for determining model robustness. This approach, supplemented with a proposed model selection algorithm designed as a meta-algorithm, is versatile and applicable to any machine learning model, provided that it is appropriate for the task at hand. This study demonstrates the application of our approach to evaluate the robustness of deep learning models. To this end, we study small models composed of a few convolutional and fully connected layers, using common optimizers due to their ease of interpretation and computational efficiency. Within this framework, we address the influence of training sample size, model weight initialization, and inductive bias on the robustness of deep learning models.


Do No Harm: A Counterfactual Approach to Safe Reinforcement Learning

arXiv.org Artificial Intelligence

Reinforcement Learning (RL) for control has become increasingly popular due to its ability to learn rich feedback policies that take into account uncertainty and complex representations of the environment. When considering safety constraints, constrained optimization approaches, where agents are penalized for constraint violations, are commonly used. In such methods, if agents are initialized in, or must visit, states where constraint violation might be inevitable, it is unclear how much they should be penalized. We address this challenge by formulating a constraint on the counterfactual harm of the learned policy compared to a default, safe policy. In a philosophical sense, this formulation only penalizes the learner for constraint violations that it caused; in a practical sense, it maintains feasibility of the optimal control problem. We present simulation studies on a rover with uncertain road friction and a tractor-trailer parking environment that demonstrate that our constraint formulation enables agents to learn safer policies than contemporary constrained RL methods.


A Simple Approach to Improve Single-Model Deep Uncertainty via Distance-Awareness

arXiv.org Artificial Intelligence

Accurate uncertainty quantification is a major challenge in deep learning, as neural networks can make overconfident errors and assign high-confidence predictions to out-of-distribution (OOD) inputs. The most popular approaches to estimating predictive uncertainty in deep learning are methods that combine predictions from multiple neural networks, such as Bayesian neural networks (BNNs) and deep ensembles. However, their practicality in real-time, industrial-scale applications is limited due to their high memory and computational cost. Furthermore, ensembles and BNNs do not necessarily fix all the issues with the underlying member networks. In this work, we study principled approaches to improve the uncertainty property of a single network, based on a single, deterministic representation. By formalizing uncertainty quantification as a minimax learning problem, we first identify distance awareness, i.e., the model's ability to quantify the distance of a testing example from the training data, as a necessary condition for a DNN to achieve high-quality (i.e., minimax optimal) uncertainty estimation. We then propose Spectral-normalized Neural Gaussian Process (SNGP), a simple method that improves the distance-awareness ability of modern DNNs with two simple changes: (1) applying spectral normalization to hidden weights to enforce bi-Lipschitz smoothness in representations and (2) replacing the last output layer with a Gaussian process layer. On a suite of vision and language understanding benchmarks and on modern architectures (Wide-ResNet and BERT), SNGP consistently outperforms other single-model approaches in prediction, calibration, and out-of-domain detection. Furthermore, SNGP provides complementary benefits to popular techniques such as deep ensembles and data augmentation, making it a simple and scalable building block for probabilistic deep learning.
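
As an illustration of change (1) only, the following is a minimal sketch of spectral normalization in plain Python. It is not the authors' implementation: the function names, the norm bound `c`, and the choice to rescale only when the estimated norm exceeds `c` are assumptions made for this example. The largest singular value is estimated by power iteration.

```python
import math
import random


def spectral_norm(W, iters=200, seed=0):
    """Estimate the largest singular value of W (a list of rows) by power iteration."""
    rng = random.Random(seed)
    n_rows, n_cols = len(W), len(W[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(n_cols)]
    for _ in range(iters):
        # u = W v, then v = W^T u, i.e. one multiplication by W^T W.
        u = [sum(W[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
        v = [sum(W[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm == 0.0:
            return 0.0  # W maps the start vector to zero (e.g. W is all zeros)
        v = [x / norm for x in v]
    # v now approximates the top right-singular vector, so ||W v|| approximates sigma_max.
    u = [sum(W[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    return math.sqrt(sum(x * x for x in u))


def spectral_normalize(W, c=1.0):
    """Rescale W so its spectral norm is at most c (hypothetical bound, assumed here)."""
    sigma = spectral_norm(W)
    if sigma <= c:
        return [row[:] for row in W]
    scale = c / sigma
    return [[scale * w for w in row] for row in W]


if __name__ == "__main__":
    W = [[3.0, 0.0], [0.0, 1.0]]
    print(spectral_norm(W))
    print(spectral_normalize(W, c=1.0))
```

Bounding the spectral norm of each hidden weight matrix limits how much a layer can stretch distances between inputs, which is the intuition behind using it to keep hidden representations distance-preserving.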


Challenges and approaches to time-series forecasting in data center telemetry: A Survey

arXiv.org Artificial Intelligence

Time-series forecasting has been an important research domain for many years. Its applications include ECG prediction, sales forecasting, weather forecasting, and even COVID-19 spread prediction. These applications have motivated many researchers to search for an optimal forecasting approach, but the appropriate modeling approach changes with the application domain. This work reviews forecasting approaches for telemetry data collected at data centers. Forecasting telemetry data is a critical feature of network and data center management products. However, the available forecasting approaches range from simple linear statistical models to high-capacity deep learning architectures. In this paper, we summarize and evaluate the performance of well-known time-series forecasting techniques. We hope that this evaluation provides a comprehensive summary that supports innovation in forecasting approaches for telemetry data.


Algebraic and Analytic Approaches for Parameter Learning in Mixture Models

arXiv.org Machine Learning

We present two different approaches for parameter learning in several mixture models in one dimension. Our first approach uses complex-analytic methods and applies to Gaussian mixtures with shared variance, binomial mixtures with shared success probability, and Poisson mixtures, among others. An example result is that $\exp(O(N^{1/3}))$ samples suffice to exactly learn a mixture of $k


Pattern-Based Approach to the Workflow Satisfiability Problem with User-Independent Constraints

Journal of Artificial Intelligence Research

The fixed-parameter tractable (FPT) approach is a powerful tool for tackling computationally hard problems. In this paper, we link FPT results to classic artificial intelligence (AI) techniques to show how they complement each other. Specifically, we consider the workflow satisfiability problem (WSP), which asks whether there exists an assignment of authorised users to the steps in a workflow specification, subject to certain constraints on the assignment. It was shown by Cohen et al. (JAIR 2014) that WSP restricted to the class of user-independent (UI) constraints, covering many practical cases, admits FPT algorithms, i.e., can be solved in time exponential only in the number of steps k and polynomial in the number of users n. Since usually k << n in WSP, such FPT algorithms are of great practical interest. We present a new interpretation of the FPT nature of the WSP with UI constraints, giving a decomposition of the problem into two levels. Exploiting this two-level split, we develop a new FPT algorithm that is many orders of magnitude faster than the previous state-of-the-art WSP algorithm and also has only polynomial-space complexity. We also introduce new pseudo-Boolean (PB) and Constraint Satisfaction (CSP) formulations of the WSP with UI constraints, which efficiently exploit this new decomposition of the problem and raise the novel issue of how to use general-purpose solvers to tackle FPT problems in a fashion that meets FPT efficiency expectations. In our computational study, we investigate, for the first time, the phase transition (PT) properties of the WSP under a model for generation of random instances. We show how PT studies can be extended, in a novel fashion, to support empirical evaluation of the scaling of FPT algorithms.
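
To make the problem statement concrete, the following is a minimal brute-force sketch of WSP in Python. It is not the FPT algorithm described in the abstract: it enumerates assignments directly, so its running time grows as n^k in the worst case. The function name and the two constraint types used here (pairs of steps that must be performed by different users, and pairs that must be performed by the same user) are assumptions chosen as simple examples of constraints that do not depend on user identities.

```python
from itertools import product


def solve_wsp(steps, authorised, different, same):
    """Return a step -> user assignment satisfying all constraints, or None.

    steps      : list of step names
    authorised : dict mapping each step to the set of users allowed to perform it
    different  : pairs of steps that must be assigned different users
    same       : pairs of steps that must be assigned the same user
    """
    # Only authorised users are considered for each step.
    candidates = [sorted(authorised[s]) for s in steps]
    for choice in product(*candidates):
        plan = dict(zip(steps, choice))
        if all(plan[a] != plan[b] for a, b in different) and all(
            plan[a] == plan[b] for a, b in same
        ):
            return plan
    return None


if __name__ == "__main__":
    steps = ["s1", "s2", "s3"]
    authorised = {"s1": {"alice", "bob"}, "s2": {"bob"}, "s3": {"alice", "bob"}}
    print(solve_wsp(steps, authorised, different=[("s1", "s2")], same=[("s1", "s3")]))
```

Both constraint types only compare whether two steps share a user, so renaming users never changes whether an assignment is valid. That property is what makes it possible to reason about which steps are grouped together separately from which concrete users are chosen.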


A Utility-Theoretic Approach to Privacy in Online Services

Journal of Artificial Intelligence Research

Online offerings such as web search, news portals, and e-commerce applications face the challenge of providing high-quality service to a large, heterogeneous user base. Recent efforts have highlighted the potential to improve performance by introducing methods to personalize services based on special knowledge about users and their context. For example, a user's demographics, location, and past search and browsing activity may be useful in enhancing the results offered in response to web search queries. However, reasonable concerns about privacy held by users, providers, and government agencies acting on behalf of citizens may limit access by services to such information. We introduce and explore an economics of privacy in personalization, where people can opt to share personal information, in a standing or on-demand manner, in return for expected enhancements in the quality of an online service. We focus on the example of web search and formulate realistic objective functions for search efficacy and privacy. We demonstrate how we can efficiently find a provably near-optimal solution to the utility-privacy tradeoff. We evaluate our methodology on data drawn from a log of the search activity of volunteer participants. We separately assess users' preferences about privacy and utility via a large-scale survey aimed at eliciting people's willingness to trade the sharing of personal data in return for gains in search efficiency. We show that a significant level of personalization can be achieved using a relatively small amount of information about users.