AITopics

2604.02889

Country: Asia > South Korea > Daejeon > Daejeon (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)

New ScientistApr-3-2026, 12:00:55 GMT

How worried should you be about an AI apocalypse?

How worried should you be about an AI apocalypse? Fears that artificial intelligence could rise up to wipe out humanity are understandable given our steady diet of sci-fi stories depicting just that, but what is the real risk? Isaac Asimov's three laws of robotics are not a practical guide Super-intelligent artificial intelligence rising up and wiping out humanity has been a common trope in science fiction for decades. Now, we live in a world where real AI seems to be advancing faster than ever. Does that mean you should start worrying about an AI apocalypse?

artificial intelligence, close advertisement skip, scenario, (12 more...)

New Scientist

Country:

Europe > Ukraine > Kyiv Oblast > Chernobyl (0.06)
Asia > Middle East > Iran (0.05)

Industry:

Information Technology (0.49)
Energy (0.48)
Marketing (0.42)

Technology: Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)

Zarkadis, Iakovos-Christos, Douligeris, Christos

Machine Learning for Network Attacks Classification and Statistical Evaluation of Adversarial Learning Methodologies for Synthetic Data Generation

Supervised detection of network attacks has always been a critical part of network intrusion detection systems (NIDS). Nowadays, in a pivotal time for artificial intelligence (AI), with even more sophisticated attacks that utilize advanced techniques, such as generative artificial intelligence (GenAI) and reinforcement learning, it has become a vital component if we wish to protect our personal data, which are scattered across the web. In this paper, we address two tasks, in the first unified multi-modal NIDS dataset, which incorporates flow-level data, packet payload information and temporal contextual features, from the reprocessed CIC-IDS-2017, CIC-IoT-2023, UNSW-NB15 and CIC-DDoS-2019, with the same feature space. In the first task we use machine learning (ML) algorithms, with stratified cross validation, in order to prevent network attacks, with stability and reliability. In the second task we use adversarial learning algorithms to generate synthetic data, compare them with the real ones and evaluate their fidelity, utility and privacy using the SDV framework, f-divergences, distinguishability and non-parametric statistical tests. The findings provide stable ML models for intrusion detection and generative models with high fidelity and utility, by combining the Synthetic Data Vault framework, the TRTS and TSTR tests, with non-parametric statistical tests and f-divergence measures.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

2603.17717

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.48)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Ishikawa, Yuya, Tamano, Shu

PAC-Bayesian Reward-Certified Outcome Weighted Learning

Estimating optimal individualized treatment rules (ITRs) via outcome weighted learning (OWL) often relies on observed rewards that are noisy or optimistic proxies for the true latent utility. Ignoring this reward uncertainty leads to the selection of policies with inflated apparent performance, yet existing OWL frameworks lack the finite-sample guarantees required to systematically embed such uncertainty into the learning objective. To address this issue, we propose PAC-Bayesian Reward-Certified Outcome Weighted Learning (PROWL). Given a one-sided uncertainty certificate, PROWL constructs a conservative reward and a strictly policy-dependent lower bound on the true expected value. Theoretically, we prove an exact certified reduction that transforms robust policy learning into a unified, split-free cost-sensitive classification task. This formulation enables the derivation of a nonasymptotic PAC-Bayes lower bound for randomized ITRs, where we establish that the optimal posterior maximizing this bound is exactly characterized by a general Bayes update. To overcome the learning-rate selection problem inherent in generalized Bayesian inference, we introduce a fully automated, bounds-based calibration procedure, coupled with a Fisher-consistent certified hinge surrogate for efficient optimization. Our experiments demonstrate that PROWL achieves improvements in estimating robust, high-value treatment regimes under severe reward uncertainty compared to standard methods for ITR estimation.

artificial intelligence, baseline, machine learning, (18 more...)

2604.01946

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > New York > New York County > New York City (0.14)
Asia > Japan > Honshū > Chūbu > Toyama Prefecture > Toyama (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Koubbi, Hugo, Geshkovski, Borjan, Rigollet, Philippe

Homogenized Transformers

We study a random model of deep multi-head self-attention in which the weights are resampled independently across layers and heads, as at initialization of training. Viewing depth as a time variable, the residual stream defines a discrete-time interacting particle system on the unit sphere. We prove that, under suitable joint scalings of the depth, the residual step size, and the number of heads, this dynamics admits a nontrivial homogenized limit. Depending on the scaling, the limit is either deterministic or stochastic with common noise; in the mean-field regime, the latter leads to a stochastic nonlinear Fokker--Planck equation for the conditional law of a representative token. In the Gaussian setting, the limiting drift vanishes, making the homogenized dynamics explicit enough to study representation collapse. This yields quantitative trade-offs between dimension, context length, and temperature, and identifies regimes in which clustering can be mitigated.

artificial intelligence, borjangeshkovski, machine learning, (19 more...)

2604.01978

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Kar, Abhilash, Saha, Basisth, Sen, Tanmay, Pradhan, Biswabrata

BVFLMSP : Bayesian Vertical Federated Learning for Multimodal Survival with Privacy

Multimodal time-to-event prediction often requires integrating sensitive data distributed across multiple parties, making centralized model training impractical due to privacy constraints. At the same time, most existing multimodal survival models produce single deterministic predictions without indicating how confident the model is in its estimates, which can limit their reliability in real-world decision making. To address these challenges, we propose BVFLMSP, a Bayesian Vertical Federated Learning (VFL) framework for multimodal time-to-event analysis based on a Split Neural Network architecture. In BVFLMSP, each client independently models a specific data modality using a Bayesian neural network, while a central server aggregates intermediate representations to perform survival risk prediction. To enhance privacy, we integrate differential privacy mechanisms by perturbing client side representations before transmission, providing formal privacy guarantees against information leakage during federated training. We first evaluate our Bayesian multimodal survival model against widely used single modality survival baselines and the centralized multimodal baseline MultiSurv. Across multimodal settings, the proposed method shows consistent improvements in discrimination performance, with up to 0.02 higher C-index compared to MultiSurv. We then compare federated and centralized learning under varying privacy budgets across different modality combinations, highlighting the tradeoff between predictive performance and privacy. Experimental results show that BVFLMSP effectively includes multimodal data, improves survival prediction over existing baselines, and remains robust under strict privacy constraints while providing uncertainty estimates.

artificial intelligence, learning, machine learning, (18 more...)

2604.02248

Country: Asia > India > West Bengal > Kolkata (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Banking & Finance (0.68)
Information Technology > Security & Privacy (0.46)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Pastor, Dominique, Dupraz, Elsa, Hbilou, Ismail, Ansel, Guillaume

A Novel Theoretical Analysis for Clustering Heteroscedastic Gaussian Data without Knowledge of the Number of Clusters

This paper addresses the problem of clustering measurement vectors that are heteroscedastic in that they can have different covariance matrices. From the assumption that the measurement vectors within a given cluster are Gaussian distributed with possibly different and unknown covariant matrices around the cluster centroid, we introduce a novel cost function to estimate the centroids. The zeros of the gradient of this cost function turn out to be the fixed-points of a certain function. As such, the approach generalizes the methodology employed to derive the existing Mean-Shift algorithm. But as a main and novel theoretical result compared to Mean-Shift, this paper shows that the sole fixed-points of the identified function tend to be the cluster centroids if both the number of measurements per cluster and the distances between centroids are large enough. As a second contribution, this paper introduces the Wald kernel for clustering. This kernel is defined as the p-value of the Wald hypothesis test for testing the mean of a Gaussian. As such, the Wald kernel measures the plausibility that a measurement vector belongs to a given cluster and it scales better with the dimension of the measurement vectors than the usual Gaussian kernel. Finally, the proposed theoretical framework allows us to derive a new clustering algorithm called CENTRE-X that works by estimating the fixed-points of the identified function. As Mean-Shift, CENTRE-X requires no prior knowledge of the number of clusters. It relies on a Wald hypothesis test to significantly reduce the number of fixed points to calculate compared to the Mean-Shift algorithm, thus resulting in a clear gain in complexity. Simulation results on synthetic and real data sets show that CENTRE-X has comparable or better performance than standard clustering algorithms K-means and Mean-Shift, even when the covariance matrices are not perfectly known.

artificial intelligence, data mining, machine learning, (20 more...)

2604.01943

Country:

Europe > France > Brittany > Finistère > Brest (0.04)
Asia > China > Liaoning Province > Shenyang (0.04)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.48)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Random Coordinate Descent on the Wasserstein Space of Probability Measures

Xu, Yewei, Li, Qin

Optimization over the space of probability measures endowed with the Wasserstein-2 geometry is central to modern machine learning and mean-field modeling. However, traditional methods relying on full Wasserstein gradients often suffer from high computational overhead in high-dimensional or ill-conditioned settings. We propose a randomized coordinate descent framework specifically designed for the Wasserstein manifold, introducing both Random Wasserstein Coordinate Descent (RWCD) and Random Wasserstein Coordinate Proximal{-Gradient} (RWCP) for composite objectives. By exploiting coordinate-wise structures, our methods adapt to anisotropic objective landscapes where full-gradient approaches typically struggle. We provide a rigorous convergence analysis across various landscape geometries, establishing guarantees under non-convex, Polyak-Łojasiewicz, and geodesically convex conditions. Our theoretical results mirror the classic convergence properties found in Euclidean space, revealing a compelling symmetry between coordinate descent on vectors and on probability measures. The developed techniques are inherently adaptive to the Wasserstein geometry and offer a robust analytical template that can be extended to other optimization solvers within the space of measures. Numerical experiments on ill-conditioned energies demonstrate that our framework offers significant speedups over conventional full-gradient methods.

artificial intelligence, machine learning, probability measure, (15 more...)

2604.01606

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Test-Time Scaling Makes Overtraining Compute-Optimal

Roberts, Nicholas, Cho, Sungjun, Gao, Zhiqi, Huang, Tzu-Heng, Wu, Albert, Orlanski, Gabriel, Trost, Avi, Buchanan, Kelly, Albarghouthi, Aws, Sala, Frederic

Modern LLMs scale at test-time, e.g. via repeated sampling, where inference cost grows with model size and the number of samples. This creates a trade-off that pretraining scaling laws, such as Chinchilla, do not address. We present Train-to-Test ($T^2$) scaling laws that jointly optimize model size, training tokens, and number of inference samples under fixed end-to-end budgets. $T^2$ modernizes pretraining scaling laws with pass@$k$ modeling used for test-time scaling, then jointly optimizes pretraining and test-time decisions. Forecasts from $T^2$ are robust over distinct modeling approaches: measuring joint scaling effect on the task loss and modeling impact on task accuracy. Across eight downstream tasks, we find that when accounting for inference cost, optimal pretraining decisions shift radically into the overtraining regime, well-outside of the range of standard pretraining scaling suites. We validate our results by pretraining heavily overtrained models in the optimal region that $T^2$ scaling forecasts, confirming their substantially stronger performance compared to pretraining scaling alone. Finally, as frontier LLMs are post-trained, we show that our findings survive the post-training stage, making $T^2$ scaling meaningful in modern deployments.

large language model, machine learning, underreview, (17 more...)

2604.01411

Country:

Asia > Middle East > Jordan (0.04)
Africa > Middle East > Egypt > Cairo Governorate > Cairo (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.86)

Bondar, Georgiy A., Eisenklam, Abigail, Cai, Yifan, Gifford, Robert, Sial, Tushar, Phan, Linh Thi Xuan, Halder, Abhishek

Generative Profiling for Soft Real-Time Systems and its Applications to Resource Allocation

Modern real-time systems require accurate characterization of task timing behavior to ensure predictable performance, particularly on complex hardware architectures. Existing methods, such as worst-case execution time analysis, often fail to capture the fine-grained timing behaviors of a task under varying resource contexts (e.g., an allocation of cache, memory bandwidth, and CPU frequency), which is necessary to achieve efficient resource utilization. In this paper, we introduce a novel generative profiling approach that synthesizes context-dependent, fine-grained timing profiles for real-time tasks, including those for unmeasured resource allocations. Our approach leverages a nonparametric, conditional multi-marginal Schrödinger Bridge (MSB) formulation to generate accurate execution profiles for unseen resource contexts, with maximum likelihood guarantees. We demonstrate the efficiency and effectiveness of our approach through real-world benchmarks, and showcase its practical utility in a representative case study of adaptive multicore resource allocation for real-time systems.

artificial intelligence, machine learning, real time system, (19 more...)

2604.01441

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Pennsylvania (0.04)
North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Architecture > Real Time Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.49)