AITopics

We formalize the problem of machine unlearning as the design of efficient unlearning algorithms corresponding to learning algorithms which perform a selection of adaptive queries from structured query classes. We give efficient unlearning algorithms for linear and prefix-sum query classes. As applications, we show that unlearning in many problems, in particular, stochastic convex optimization (SCO), can be reduced to the above, yielding improved guarantees for the problem. In particular, for smooth Lipschitz losses and any $\rho>0$, our results yield an unlearning algorithm with excess population risk of $\tilde O\big(\frac{1}{\sqrt{n}}+\frac{\sqrt{d}}{n\rho}\big)$ with unlearning query (gradient) complexity $\tilde O(\rho \cdot \text{Retraining Complexity})$, where $d$ is the model dimensionality and $n$ is the initial number of samples. For non-smooth Lipschitz losses, we give an unlearning algorithm with excess population risk $\tilde O\big(\frac{1}{\sqrt{n}}+\big(\frac{\sqrt{d}}{n\rho}\big)^{1/2}\big)$ with the same unlearning query (gradient) complexity. Furthermore, in the special case of Generalized Linear Models (GLMs), such as those in linear and logistic regression, we get dimension-independent rates of $\tilde O\big(\frac{1}{\sqrt{n}} +\frac{1}{(n\rho)^{2/3}}\big)$ and $\tilde O\big(\frac{1}{\sqrt{n}} +\frac{1}{(n\rho)^{1/3}}\big)$ for smooth Lipschitz and non-smooth Lipschitz losses respectively. Finally, we give generalizations of the above from one unlearning request to dynamic streams consisting of insertions and deletions.
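To make the notion of a prefix-sum query concrete, the sketch below stores per-sample values in a Fenwick (binary indexed) tree, so that deleting one sample updates every prefix sum containing it in O(log n) time rather than recomputing from scratch. This is a standard data structure used purely for illustration. It is not claimed to be the paper's algorithm, and it ignores the statistical indistinguishability requirements of formal unlearning. The class name `PrefixSums` and its methods are hypothetical.

```python
class PrefixSums:
    """Fenwick (binary indexed) tree over per-sample values (illustrative only)."""

    def __init__(self, values):
        self.n = len(values)
        self.values = list(values)
        self.tree = [0.0] * (self.n + 1)
        for i, v in enumerate(values):
            self._add(i, v)

    def _add(self, i, delta):
        i += 1  # the tree uses 1-based indices internally
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i

    def prefix_sum(self, k):
        """Sum of the first k values, with deleted samples counted as zero."""
        total = 0.0
        while k > 0:
            total += self.tree[k]
            k -= k & -k
        return total

    def unlearn(self, i):
        """Remove sample i's contribution from every prefix that contains it."""
        self._add(i, -self.values[i])
        self.values[i] = 0.0


data = [1.0, 2.0, 3.0, 4.0]
ps = PrefixSums(data)
before = ps.prefix_sum(4)  # 10.0
ps.unlearn(1)              # delete the sample whose value is 2.0
after = ps.prefix_sum(4)   # 8.0
```

A plain linear query (a single sum over all samples) is the special case `prefix_sum(n)`, where deletion amounts to subtracting one term.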

algorithm, artificial intelligence, machine learning, (17 more...)

2307.11228

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
North America > United States > California (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Vibho, Amrutaa, Bataineh, Ali Al

NeoSySPArtaN: A Neuro-Symbolic Spin Prediction Architecture for higher-order multipole waveforms from eccentric Binary Black Hole mergers using Numerical Relativity

The prediction of spin magnitudes in binary black hole and neutron star mergers is crucial for understanding the astrophysical processes and gravitational wave (GW) signals emitted during these cataclysmic events. In this paper, we present a novel Neuro-Symbolic Architecture (NSA) that combines the power of neural networks and symbolic regression to accurately predict spin magnitudes of black hole and neutron star mergers. Our approach utilizes GW waveform data obtained from numerical relativity simulations in the SXS Waveform catalog. By combining these two approaches, we leverage the strengths of both paradigms, enabling a comprehensive and accurate prediction of spin magnitudes. Our experiments demonstrate that the proposed architecture achieves an impressive root-mean-squared-error (RMSE) of 0.05 and mean-squared-error (MSE) of 0.03 for the NSA model and an RMSE of 0.12 for the symbolic regression model alone. We train this model to handle higher-order multipole waveforms, with a specific focus on eccentric candidates, which are known to exhibit unique characteristics. Our results provide a robust and interpretable framework for predicting spin magnitudes in mergers. This has implications for understanding the astrophysical properties of black holes and deciphering the physics underlying the GW signals.

artificial intelligence, machine learning, waveform, (13 more...)

2307.11003

Country:

North America > United States (0.49)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.70)

Industry: Government > Regional Government > North America Government > United States Government (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
(2 more...)

Richman, Ronald, Wüthrich, Mario V.

Conditional expectation network for SHAP

A very popular model-agnostic technique for explaining predictive models is the SHapley Additive exPlanation (SHAP). The two most popular versions of SHAP are a conditional expectation version and an unconditional expectation version (the latter is also known as interventional SHAP). Except for tree-based methods, usually the unconditional version is used (for computational reasons). We provide a (surrogate) neural network approach which allows us to efficiently calculate the conditional version for both neural networks and other regression models, and which properly considers the dependence structure in the feature components. This proposal is also useful to provide drop1 and anova analyses in complex regression models which are similar to their generalized linear model (GLM) counterparts, and we provide a partial dependence plot (PDP) counterpart that considers the right dependence structure in the feature components.

artificial intelligence, conditional expectation, machine learning, (19 more...)

2307.10654

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
Africa > South Africa > Gauteng > Johannesburg (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.57)

Belyaeva, Anastasiya, Cosentino, Justin, Hormozdiari, Farhad, Eswaran, Krish, Shetty, Shravya, Corrado, Greg, Carroll, Andrew, McLean, Cory Y., Furlotte, Nicholas A.

Multimodal LLMs for health grounded in individual-specific data

Foundation large language models (LLMs) have shown an impressive ability to solve tasks across a wide range of fields including health. To effectively solve personalized health tasks, LLMs need the ability to ingest a diversity of data modalities that are relevant to an individual's health status. In this paper, we take a step towards creating multimodal LLMs for health that are grounded in individual-specific data by developing a framework (HeLM: Health Large Language Model for Multimodal Understanding) that enables LLMs to use high-dimensional clinical modalities to estimate underlying disease risk. HeLM encodes complex data modalities by learning an encoder that maps them into the LLM's token embedding space and for simple modalities like tabular data by serializing the data into text. Using data from the UK Biobank, we show that HeLM can effectively use demographic and clinical features in addition to high-dimensional time-series data to estimate disease risk. For example, HeLM achieves an AUROC of 0.75 for asthma prediction when combining tabular and spirogram data modalities compared with 0.49 when only using tabular data. Overall, we find that HeLM outperforms or performs at parity with classical machine learning approaches across a selection of eight binary traits. Furthermore, we investigate the downstream uses of this model such as its generalizability to out-of-distribution traits and its ability to power conversations around individual health and wellness.

large language model, machine learning, natural language, (19 more...)

2307.09018

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
South America > Uruguay > Artigas > Artigas (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre:

Research Report > Experimental Study (0.50)
Research Report > New Finding (0.32)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Consumer Health (1.00)
Health & Medicine > Health Care Technology (0.93)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Huang, Po-Wei, Rebentrost, Patrick

Post-variational quantum neural networks

arXiv.org Artificial Intelligence, Jul-19-2023

Quantum computing has the potential to provide substantial computational advantages over current state-of-the-art classical supercomputers. However, current hardware is not advanced enough to execute fault-tolerant quantum algorithms. The alternative, hybrid quantum-classical computing with variational algorithms, can suffer from barren plateaus, which slow the convergence of gradient-based optimization techniques. In this paper, we discuss "post-variational strategies", which shift tunable parameters from the quantum computer to the classical computer, opting for ensemble strategies when optimizing quantum models. We discuss various strategies and design principles for constructing individual quantum circuits, where the resulting ensembles can be optimized with convex programming. Further, we discuss architectural designs of post-variational quantum neural networks and analyze the propagation of estimation errors throughout such neural networks. Lastly, we show that our algorithm can be applied to real-world applications such as image classification on handwritten digits, producing a 96% classification accuracy.

algorithm, artificial intelligence, machine learning, (17 more...)

2307.10560

Country:

Asia > Singapore (0.04)
North America > United States > New York (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Construction & Engineering (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

arXiv.org Machine Learning, Jul-18-2023

The Connection Between R-Learning and Inverse-Variance Weighting for Estimation of Heterogeneous Treatment Effects

Fisher, Aaron

Our motivation is to shed light on the performance of the widely popular "R-Learner." Like many other methods for estimating conditional average treatment effects (CATEs), R-Learning can be expressed as a weighted pseudo-outcome regression (POR). Previous comparisons of POR techniques have paid careful attention to the choice of pseudo-outcome transformation. However, we argue that the dominant driver of performance is actually the choice of weights. Specifically, we argue that R-Learning implicitly performs an inverse-variance weighted form of POR. These weights stabilize the regression and allow for convenient simplifications of bias terms.
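The weighted-POR view can be checked numerically in the simplest setting. The sketch below assumes a constant treatment effect, known nuisance functions m(X) and e(X), and synthetic data; all variable names are hypothetical and nothing here is taken from the paper's code. It relies on the algebraic identity ((Y - m) - (A - e)τ)² = (A - e)² ((Y - m)/(A - e) - τ)², so the R-Learner loss is a regression of the pseudo-outcome (Y - m)/(A - e) on τ with weights (A - e)². Under homoskedastic noise that pseudo-outcome has variance proportional to 1/(A - e)², which is why these weights act as inverse-variance weights.

```python
import random

random.seed(0)
n = 200
tau_true = 1.5  # constant treatment effect (assumed for this toy example)

# Synthetic nuisances and data; everything here is hypothetical.
e = [random.uniform(0.2, 0.8) for _ in range(n)]      # propensity e(X)
A = [1 if random.random() < p else 0 for p in e]      # binary treatment
m = [random.gauss(0.0, 1.0) for _ in range(n)]        # outcome mean m(X)
Y = [mi + (a - p) * tau_true + random.gauss(0.0, 0.1)
     for mi, a, p in zip(m, A, e)]

# R-Learner with constant tau: least squares of (Y - m) on (A - e).
num = sum((a - p) * (y - mi) for y, mi, a, p in zip(Y, m, A, e))
den = sum((a - p) ** 2 for a, p in zip(A, e))
tau_r = num / den

# Weighted pseudo-outcome regression: weighted mean of psi with w = (A - e)^2.
psi = [(y - mi) / (a - p) for y, mi, a, p in zip(Y, m, A, e)]
w = [(a - p) ** 2 for a, p in zip(A, e)]
tau_por = sum(wi * pi for wi, pi in zip(w, psi)) / sum(w)
```

The two estimates `tau_r` and `tau_por` agree up to floating-point error, since each term w·ψ equals (A - e)(Y - m).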

artificial intelligence, machine learning, regression, (19 more...)

2307.09700

Country:

North America > United States > Virginia (0.04)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Behdin, Kayhan, Chen, Wenyu, Mazumder, Rahul

Sparse Gaussian Graphical Models with Discrete Optimization: Computational and Statistical Perspectives

arXiv.org Artificial Intelligence, Jul-18-2023

We consider the problem of learning a sparse graph underlying an undirected Gaussian graphical model, a key problem in statistical machine learning. Given $n$ samples from a multivariate Gaussian distribution with $p$ variables, the goal is to estimate the $p \times p$ inverse covariance matrix (aka precision matrix), assuming it is sparse (i.e., has a few nonzero entries). We propose GraphL0BnB, a new estimator based on an $\ell_0$-penalized version of the pseudolikelihood function, while most earlier approaches are based on the $\ell_1$-relaxation. Our estimator can be formulated as a convex mixed integer program (MIP) which can be difficult to compute at scale using off-the-shelf commercial solvers. To solve the MIP, we propose a custom nonlinear branch-and-bound (BnB) framework that solves node relaxations with tailored first-order methods. As a by-product of our BnB framework, we propose large-scale solvers for obtaining good primal solutions that are of independent interest. We derive novel statistical guarantees (estimation and variable selection) for our estimator and discuss how our approach improves upon existing estimators. Our numerical experiments on real/synthetic datasets suggest that our method can solve, to near-optimality, problem instances with $p = 10^4$ -- corresponding to a symmetric matrix of size $p \times p$ with $p^2/2$ binary variables. We demonstrate the usefulness of GraphL0BnB versus various state-of-the-art approaches on a range of datasets.

algorithm 1, artificial intelligence, machine learning, (17 more...)

2307.09366

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > Promising Solution (0.65)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Cacciarelli, Davide, Kulahci, Murat, Tyssedal, John Sølve

Robust online active learning

arXiv.org Artificial Intelligence, Jul-18-2023

In many industrial applications, obtaining labeled observations is not straightforward as it often requires the intervention of human experts or the use of expensive testing equipment. In these circumstances, active learning can be highly beneficial in suggesting the most informative data points to be used when fitting a model. Reducing the number of observations needed for model development alleviates both the computational burden required for training and the operational expenses related to labeling. Online active learning, in particular, is useful in high-volume production processes where the decision about the acquisition of the label for a data point needs to be taken within an extremely short time frame. However, despite the recent efforts to develop online active learning strategies, the behavior of these methods in the presence of outliers has not been thoroughly examined. In this work, we investigate the performance of online active linear regression in contaminated data streams. Our study shows that the currently available query strategies are prone to sample outliers, whose inclusion in the training set eventually degrades the predictive performance of the models. To address this issue, we propose a solution that bounds the search area of a conditional D-optimal algorithm and uses a robust estimator. Our approach strikes a balance between exploring unseen regions of the input space and protecting against outliers. Through numerical simulations, we show that the proposed method is effective in improving the performance of online active learning in the presence of outliers, thus expanding the potential applications of this powerful tool.

artificial intelligence, machine learning, outlier, (17 more...)

doi: 10.1002/qre.3392

2302.00422

Country:

Europe > Sweden > Norrbotten County > Luleå (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > Norway > Central Norway > Trøndelag > Trondheim (0.04)
Europe > Denmark > Capital Region > Kongens Lyngby (0.04)

Genre:

Instructional Material > Online (1.00)
Research Report > New Finding (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)

Roy, Saptarshi, Tewari, Ambuj, Zhu, Ziwei

Understanding Best Subset Selection: A Tale of Two C(omplex)ities

arXiv.org Machine Learning, Jul-17-2023

For decades, best subset selection (BSS) has eluded statisticians, mainly because of its computational bottleneck. Recently, however, modern computational breakthroughs have rekindled theoretical interest in BSS and have led to new findings. In particular, Guo et al. (2020) showed that the model selection performance of BSS is governed by a margin quantity that is robust to design dependence, unlike modern methods such as LASSO, SCAD, and MCP. Motivated by their theoretical results, in this paper we also study the variable selection properties of best subset selection in the high-dimensional sparse linear regression setup. We show that apart from the identifiability margin, the following two complexity measures play a fundamental role in characterizing the margin condition for model consistency: (a) complexity of residualized features, (b) complexity of spurious projections. In particular, we establish a simple margin condition that depends only on the identifiability margin and the dominating one of the two complexity measures. Furthermore, we show that a margin condition depending on a similar margin quantity and similar complexity measures is also necessary for model consistency of BSS. For a broader understanding, we also consider some simple illustrative examples that demonstrate the variation in the complexity measures and refine our theoretical understanding of the model selection performance of BSS under different correlation structures.

artificial intelligence, complexity, machine learning, (17 more...)

2301.06259

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)

arXiv.org Artificial Intelligence, Jul-17-2023

On-the-fly machine learning for parametrization of the effective Hamiltonian

Ma, Xingyue, Bellaiche, L., Wu, Di, Yang, Yurong

The first-principles-based effective Hamiltonian is widely used to predict and simulate the properties of ferroelectrics and relaxor ferroelectrics. However, the parametrization method of the effective Hamiltonian is complicated and can hardly handle systems with complex interactions and/or complex components. Here, we developed an on-the-fly machine learning approach to parametrize the effective Hamiltonian based on Bayesian linear regression. The parametrization is completed in molecular dynamics simulations, with the energy, forces and stress predicted at each step along with their uncertainties. First-principles calculations are executed when the uncertainties are large, in order to retrain the parameters. This approach provides a universal and automatic way to compute the effective Hamiltonian parameters for any considered system, including complex systems which previous methods cannot handle. BaTiO3 and Pb(Sc,Ta)O3 are taken as examples to show the accuracy of this approach compared with the conventional first-principles parametrization method.
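The uncertainty-triggered loop described above can be sketched with a one-parameter Bayesian linear regression. This is a generic illustration, not the authors' implementation: the model y = w·x + noise, the constants `ALPHA`, `BETA` and `THRESHOLD`, and the function `expensive_oracle` (a stand-in for a first-principles calculation) are all assumptions made for the example.

```python
import math

ALPHA = 1.0      # prior precision on the single weight w (assumed)
BETA = 100.0     # noise precision, i.e. noise std 0.1 (assumed)
THRESHOLD = 0.2  # predictive std above which the oracle is called (assumed)


def expensive_oracle(x):
    """Hypothetical stand-in for a first-principles calculation."""
    return 2.0 * x


def posterior(xs, ys):
    """Posterior mean and variance of w for the model y = w * x + noise."""
    var = 1.0 / (ALPHA + BETA * sum(x * x for x in xs))
    mean = BETA * var * sum(x * y for x, y in zip(xs, ys))
    return mean, var


train_x, train_y = [], []
oracle_calls = 0
for x in [1.0, 0.5, 3.0, 2.0]:  # configurations visited by the simulation
    mean, var = posterior(train_x, train_y)
    # Predictive std combines observation noise and parameter uncertainty.
    pred_std = math.sqrt(1.0 / BETA + x * x * var)
    if pred_std > THRESHOLD:
        # Too uncertain: query the oracle and retrain on the enlarged set.
        train_x.append(x)
        train_y.append(expensive_oracle(x))
        oracle_calls += 1

w_mean, w_var = posterior(train_x, train_y)
```

In this toy run the oracle is called only at x = 1.0 and x = 3.0. The other two steps are handled by the current model, which is the cost saving the on-the-fly scheme aims for.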

artificial intelligence, calculation, machine learning, (17 more...)

2307.08929

Country:

North America > United States > Arkansas > Washington County > Fayetteville (0.14)
Asia > China > Jiangsu Province > Nanjing (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)