Asia
Sparse $ε$ insensitive zone bounded asymmetric elastic net support vector machines for pattern classification
Existing support vector machines(SVM) models are sensitive to noise and lack sparsity, which limits their performance. To address these issues, we combine the elastic net loss with a robust loss framework to construct a sparse $\varepsilon$-insensitive bounded asymmetric elastic net loss, and integrate it with SVM to build $\varepsilon$ Insensitive Zone Bounded Asymmetric Elastic Net Loss-based SVM($\varepsilon$-BAEN-SVM). $\varepsilon$-BAEN-SVM is both sparse and robust. Sparsity is proven by showing that samples inside the $\varepsilon$-insensitive band are not support vectors. Robustness is theoretically guaranteed because the influence function is bounded. To solve the non-convex optimization problem, we design a half-quadratic algorithm based on clipping dual coordinate descent. It transforms the problem into a series of weighted subproblems, improving computational efficiency via the $\varepsilon$ parameter. Experiments on simulated and real datasets show that $\varepsilon$-BAEN-SVM outperforms traditional and existing robust SVMs. It balances sparsity and robustness well in noisy environments. Statistical tests confirm its superiority. Under the Gaussian kernel, it achieves better accuracy and noise insensitivity, validating its effectiveness and practical value.
Claude Mythos Is Everyone's Problem
What happens when AI can hack everything? For the past several weeks, Anthropic says it secretly possessed a tool potentially capable of commandeering most computer servers in the world. This is a bot that, if unleashed, might be able to hack into banks, exfiltrate state secrets, and fry crucial infrastructure. Already, according to the company, this AI model has identified thousands of major cybersecurity vulnerabilities--including exploits in every single major operating system and browser. This level of cyberattack is typically available only to elite, state-sponsored hacking cells in a very small number of countries including China, Russia, and the United States.
AI-pocalypse: Anthropic sparks fears after developing a bot that's 'too dangerous to release to the public'
New Jersey man's chilling'cancer map' fuels fears of poisoned neighborhood with 41 cases and counting Three stocks are high as a kite after Trump's wild executive order as investors rush to cash in New'Hollywood dose' pill: A-listers hooked on'youth elixir' that dermatologists say is anti-ageing, shrinks pores, smooths wrinkles... and even banishes rosacea Days after we got engaged, the love of my life told me he'd killed a man and buried him in a bog. I reported him to police... but then I made this irreversible mistake Papa John's under fire for an outrageous message now printed on all pizza boxes Iran vows to put'new cards on the battlefield' after Trump breaks ceasefire as Vance travels to Pakistan for peace talks before deadline ends TODAY NASA's return of humans to the Moon in 2028 faces alarming setback California coffee farmers nearly escaped death before'tragic accident' as autopsy reveals disturbing new details How to lose weight when perimenopause sabotages your metabolism: I'm a PT but when I hit 46, I piled on the pounds overnight. Australia has spoken: Report reveals what everyone is thinking about Prince Harry and Meghan Markle's Australia tour Humiliating moment runner celebrates winning marathon... only to be pipped at the line by rival in brutal finish How prophet of extreme Mormon cult who had 20 wives - some aged just 10 - is now spreading evil from prison, as woman who bravely exposed him reveals new threat Netflix doc missed and'sister brides' still under his thrall Even Cameron Diaz admits she's a dirty mess. I'll get hate for saying it, but we're all thinking the same thing about THAT wrinkled forehead: CAROLINE BULLOCK Two high school sweethearts survived the Columbine High School massacre. Months later, they were gunned down in a Subway on Valentine's Day in a crime that remains unsolved AI-pocalypse: Anthropic sparks fears after developing a bot that's'too dangerous to release to the public' Anthropic has sparked fears after revealing that it has developed an AI bot deemed too dangerous to release to the public.
Tight Convergence Rates for Online Distributed Linear Estimation with Adversarial Measurements
Roy, Nibedita, Halder, Vishal, Thoppe, Gugan, Reiffers-Masson, Alexandre, Dhanakshirur, Mihir, Naman, null, Azor, Alexandre
We study mean estimation of a random vector $X$ in a distributed parameter-server-worker setup. Worker $i$ observes samples of $a_i^\top X$, where $a_i^\top$ is the $i$th row of a known sensing matrix $A$. The key challenges are adversarial measurements and asynchrony: a fixed subset of workers may transmit corrupted measurements, and workers are activated asynchronously--only one is active at any time. In our previous work, we proposed a two-timescale $\ell_1$-minimization algorithm and established asymptotic recovery under a null-space-property-like condition on $A$. In this work, we establish tight non-asymptotic convergence rates under the same null-space-property-like condition. We also identify relaxed conditions on $A$ under which exact recovery may fail but recovery of a projected component of $\mathbb{E}[X]$ remains possible. Overall, our results provide a unified finite-time characterization of robustness, identifiability, and statistical efficiency in distributed linear estimation with adversarial workers, with implications for network tomography and related distributed sensing problems.
Generalization error bounds for two-layer neural networks with Lipschitz loss function
Nguwi, Jiang Yu, Privault, Nicolas
We derive generalization error bounds for the training of two-layer neural networks without assuming boundedness of the loss function, using Wasserstein distance estimates on the discrepancy between a probability distribution and its associated empirical measure, together with moment bounds for the associated stochastic gradient method. In the case of independent test data, we obtain a dimension-free rate of order $O(n^{-1/2} )$ on the $n$-sample generalization error, whereas without independence assumption, we derive a bound of order $O(n^{-1 / ( d_{\rm in}+d_{\rm out} )} )$, where $d_{\rm in}$, $d_{\rm out}$ denote input and output dimensions. Our bounds and their coefficients can be explicitly computed prior to the training of the model, and are confirmed by numerical simulations.
Bi-Lipschitz Autoencoder With Injectivity Guarantee
Zhan, Qipeng, Zhou, Zhuoping, Wang, Zexuan, Long, Qi, Shen, Li
Autoencoders are widely used for dimensionality reduction, based on the assumption that high-dimensional data lies on low-dimensional manifolds. Regularized autoencoders aim to preserve manifold geometry during dimensionality reduction, but existing approaches often suffer from non-injective mappings and overly rigid constraints that limit their effectiveness and robustness. In this work, we identify encoder non-injectivity as a core bottleneck that leads to poor convergence and distorted latent representations. To ensure robustness across data distributions, we formalize the concept of admissible regularization and provide sufficient conditions for its satisfaction. In this work, we propose the Bi-Lipschitz Autoencoder (BLAE), which introduces two key innovations: (1) an injective regularization scheme based on a separation criterion to eliminate pathological local minima, and (2) a bi-Lipschitz relaxation that preserves geometry and exhibits robustness to data distribution drift. Empirical results on diverse datasets show that BLAE consistently outperforms existing methods in preserving manifold structure while remaining resilient to sampling sparsity and distribution shifts. Code is available at https://github.com/qipengz/BLAE.
Time Series Gaussian Chain Graph Models
Fang, Qin, Qiao, Xinghao, Wang, Zihan
Time series graphical models have recently received considerable attention for characterizing (conditional) dependence structures in multivariate time series. In many applications, the multivariate series exhibit variable-partitioned blockwise dependence, with distinct patterns within and across blocks. In this paper, we introduce a new class of time series Gaussian chain graph models that represent contemporaneous and lagged causal relations via directed edges across blocks, while capturing within-block conditional dependencies through undirected edges. In the frequency domain, this formulation induces a cross-frequency shared group sparse plus group low-rank decomposition of the inverse spectral density matrices, which we exploit to establish identifiability of the time series chain graph structure. Building on this, we then propose a three-stage learning procedure for estimating the undirected and directed edge sets, which involves optimizing a regularized Whittle likelihood with a group lasso penalty to encourage group sparsity and a novel tensor-unfolding nuclear norm penalty to enforce group low-rank structure. We investigate the asymptotic properties of the proposed method, ensuring its consistency for exact recovery of the chain graph structure. The superior empirical performance of the proposed method is demonstrated through both extensive simulation studies and an application to U.S. macroeconomic data that highlights key monetary policy transmission mechanisms.
Amortized Filtering and Smoothing with Conditional Normalizing Flows
Cui, Tiangang, Feng, Xiaodong, Pei, Chenlong, Wan, Xiaoliang, Zhou, Tao
Bayesian filtering and smoothing for high-dimensional nonlinear dynamical systems are fundamental yet challenging problems in many areas of science and engineering. In this work, we propose AFSF, a unified amortized framework for filtering and smoothing with conditional normalizing flows. The core idea is to encode each observation history into a fixed-dimensional summary statistic and use this shared representation to learn both a forward flow for the filtering distribution and a backward flow for the backward transition kernel. Specifically, a recurrent encoder maps each observation history to a fixed-dimensional summary statistic whose dimension does not depend on the length of the time series. Conditioned on this shared summary statistic, the forward flow approximates the filtering distribution, while the backward flow approximates the backward transition kernel. The smoothing distribution over an entire trajectory is then recovered by combining the terminal filtering distribution with the learned backward flow through the standard backward recursion. By learning the underlying temporal evolution structure, AFSF also supports extrapolation beyond the training horizon. Moreover, by coupling the two flows through shared summary statistics, AFSF induces an implicit regularization across latent state trajectories and improves trajectory-level smoothing. In addition, we develop a flow-based particle filtering variant that provides an alternative filtering procedure and enables ESS-based diagnostics when explicit model factors are available. Numerical experiments demonstrate that AFSF provides accurate approximations of both filtering distributions and smoothing paths.
Weighted Bayesian Conformal Prediction
Conformal prediction provides distribution-free prediction intervals with finite-sample coverage guarantees, and recent work by Snell \& Griffiths reframes it as Bayesian Quadrature (BQ-CP), yielding powerful data-conditional guarantees via Dirichlet posteriors over thresholds. However, BQ-CP fundamentally requires the i.i.d. assumption -- a limitation the authors themselves identify. Meanwhile, weighted conformal prediction handles distribution shift via importance weights but remains frequentist, producing only point-estimate thresholds. We propose \textbf{Weighted Bayesian Conformal Prediction (WBCP)}, which generalizes BQ-CP to arbitrary importance-weighted settings by replacing the uniform Dirichlet $\Dir(1,\ldots,1)$ with a weighted Dirichlet $\Dir(\neff \cdot \tilde{w}_1, \ldots, \neff \cdot \tilde{w}_n)$, where $\neff$ is Kish's effective sample size. We prove four theoretical results: (1)~$\neff$ is the unique concentration parameter matching frequentist and Bayesian variances; (2)~posterior standard deviation decays as $O(1/\sqrt{\neff})$; (3)~BQ-CP's stochastic dominance guarantee extends to per-weight-profile data-conditional guarantees; (4)~the HPD threshold provides $O(1/\sqrt{\neff})$ improvement in conditional coverage. We instantiate WBCP for spatial prediction as \emph{Geographical BQ-CP}, where kernel-based spatial weights yield per-location posteriors with interpretable diagnostics. Experiments on synthetic and real-world spatial datasets demonstrate that WBCP maintains coverage guarantees while providing substantially richer uncertainty information.
The Theory and Practice of Highly Scalable Gaussian Process Regression with Nearest Neighbours
Allison, Robert, Maciazek, Tomasz, Stephenson, Anthony
Gaussian process ($GP$) regression is a widely used non-parametric modeling tool, but its cubic complexity in the training size limits its use on massive data sets. A practical remedy is to predict using only the nearest neighbours of each test point, as in Nearest Neighbour Gaussian Process ($NNGP$) regression for geospatial problems and the related scalable $GPnn$ method for more general machine-learning applications. Despite their strong empirical performance, the large-$n$ theory of $NNGP/GPnn$ remains incomplete. We develop a theoretical framework for $NNGP$ and $GPnn$ regression. Under mild regularity assumptions, we derive almost sure pointwise limits for three key predictive criteria: mean squared error ($MSE$), calibration coefficient ($CAL$), and negative log-likelihood ($NLL$). We then study the $L_2$-risk, prove universal consistency, and show that the risk attains Stone's minimax rate $n^{-2α/(2p+d)}$, where $α$ and $p$ capture regularity of the regression problem. We also prove uniform convergence of $MSE$ over compact hyper-parameter sets and show that its derivatives with respect to lengthscale, kernel scale, and noise variance vanish asymptotically, with explicit rates. This explains the observed robustness of $GPnn$ to hyper-parameter tuning. These results provide a rigorous statistical foundation for $NNGP/GPnn$ as a highly scalable and principled alternative to full $GP$ models.