AITopics | Industry

ImageSentinel: Protecting Visual Datasets from Unauthorized Retrieval-Augmented Image Generation

Neural Information Processing Systems | Jun-11-2026, 01:04:29 GMT

The widespread adoption of Retrieval-Augmented Image Generation (RAIG) has raised significant concerns about the unauthorized use of private image datasets. While these systems have shown remarkable capabilities in enhancing generation quality through reference images, protecting visual datasets from unauthorized use in such systems remains a challenging problem. Traditional digital watermarking approaches face limitations in RAIG systems, as the complex feature extraction and recombination processes fail to preserve watermark signals during generation. To address these challenges, we propose ImageSentinel, a novel framework for protecting visual datasets in RAIG. Our framework synthesizes sentinel images that maintain visual consistency with the original dataset. These sentinels enable protection verification through randomly generated character sequences that serve as retrieval keys. To ensure seamless integration, we leverage vision-language models to generate the sentinel images. Experimental results demonstrate that ImageSentinel effectively detects unauthorized dataset usage while preserving generation quality for authorized applications.

artificial intelligence, machine learning, proceedings, (9 more...)

Neural Information Processing Systems

Industry: Information Technology > Security & Privacy (0.61)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.65)
Information Technology > Artificial Intelligence > Vision (0.65)
Information Technology > Artificial Intelligence > Machine Learning (0.45)

BO4Mob: Bayesian Optimization Benchmarks for High-Dimensional Urban Mobility Problem

Neural Information Processing Systems | Jun-11-2026, 01:02:26 GMT

We introduce BO4Mob, a new benchmark framework for high-dimensional Bayesian Optimization (BO), driven by the challenge of origin-destination (OD) travel demand estimation in large urban road networks. Estimating OD travel demand from limited traffic sensor data is a difficult inverse optimization problem, particularly in real-world, large-scale transportation networks. This problem involves optimizing over high-dimensional continuous spaces where each objective evaluation is computationally expensive, stochastic, and non-differentiable. BO4Mob comprises five scenarios based on real-world San Jose, CA road networks, with input dimensions scaling up to 10,100. These scenarios utilize high-resolution, open-source traffic simulations that incorporate realistic nonlinear and stochastic dynamics. We demonstrate the benchmark's utility by evaluating five optimization methods: three state-of-the-art BO algorithms and two non-BO baselines. This benchmark is designed to support both the development of scalable optimization algorithms and their application for the design of data-driven urban mobility models, including high-resolution digital twins of metropolitan road networks.

artificial intelligence, optimization problem, proceedings, (6 more...)

Neural Information Processing Systems

Country: North America > United States > California > Santa Clara County > San Jose (0.27)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.97)

Teachers union facing years of deficits seeks to become Oregon's largest PAC, internal documents show

FOX News | Jun-11-2026, 01:00:36 GMT

artificial intelligence, fox new digital, social media, (10 more...)

FOX News

Country: North America > United States > Oregon (0.45)

Industry:

Media > News (1.00)
Government > Regional Government > North America Government > United States Government (0.96)

Technology:

Information Technology > Communications > Social Media (0.98)
Information Technology > Artificial Intelligence (0.70)

AI-Generated Video Detection via Perceptual Straightening

Neural Information Processing Systems | Jun-11-2026, 00:02:19 GMT

The rapid advancement of generative AI enables highly realistic synthetic video, posing significant challenges for content authentication and raising urgent concerns about misuse.

artificial intelligence, machine learning, proceedings, (5 more...)

Neural Information Processing Systems

Industry: Information Technology > Security & Privacy (0.60)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.57)

Renewable Lasso without Batch-Number Constraints: A Gradient-Enhanced Approach

Gao, Junzhuo, Peng, Ling, Guo, Xu, Lian, Heng

arXiv.org Machine Learning | Jun-11-2026

We study online estimation for high-dimensional generalized linear models with streaming data. First, for the non-distributed setting, we propose a gradient-enhanced surrogate loss that approximates the cumulative loss using only historical summaries. It modifies and improves upon the existing renewable estimation approach for the same model in the high-dimensional setting and removes the batch-number constraint of previous studies. We then extend the method to distributed streaming data under the master-client architecture, where batches are partitioned across sites and only summaries (gradient vectors) are exchanged. Rather than directly applying the popular method of Jordan et al. (2019) to the surrogate quadratic loss, we use an adjusted approach that does not require the clients to compute the full surrogate loss. We derive non-asymptotic error bounds under high-dimensional scaling, without the stringent constraint on the number of batches imposed in previous studies. Simulation results under linear and logistic models, together with a real-data application, show improved accuracy over existing renewable estimators.

artificial intelligence, machine learning, pkk, (17 more...)

arXiv.org Machine Learning

2606.11738

Country:

Asia > China (0.93)
Asia > Middle East > Jordan (0.25)

Genre: Research Report (0.82)

Industry: Information Technology (0.46)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.93)

Fixed-Parameter Tractability of Private Synthetic Data Generation

Ghazi, Badih, Guzmán, Cristóbal, Kamath, Pritish, Knop, Alexander, Kumar, Ravi, Manurangsi, Pasin

arXiv.org Machine Learning | Jun-11-2026

We study the problem of generating synthetic data under differential privacy. We establish fixed-parameter tractability (FPT) for this problem where the parameter is the treewidth of the query family's incidence graph. Our algorithms attain optimal error rates across all regimes and are realized by two different approaches: the first is based on linear programming (LP) and the FPT of the separation problem for the LP dual; the second is based on a subsampled private multiplicative weights method, where we obtain FPT for sampling from Gibbs distributions. Both approaches are unified by a dynamic programming framework over a tree decomposition.

artificial intelligence, bayesian inference, machine learning, (19 more...)

arXiv.org Machine Learning

2606.11283

Genre: Research Report (0.50)

Industry: Information Technology (0.46)

Time Series Analysis in Machine Learning

Pagliaro, Antonio, Anzalone, Anna

arXiv.org Machine Learning | Jun-11-2026

Time series analysis is a fundamental component of machine learning, especially in astrophysics and cosmology where temporal data abound. This chapter provides a pedagogical review of time series analysis techniques from a machine learning perspective. We cover the basic concepts of time series (stationarity, autocorrelation, seasonality), classical statistical models (autoregressive, moving average, ARIMA, exponential smoothing, state-space models), and modern machine learning approaches. In particular, we discuss how traditional statistical methods lay the groundwork, and then explore machine learning methods for time series, including feature-based regression, tree-based ensemble methods, hidden Markov models, Gaussian processes, and deep learning models (recurrent neural networks, convolutional networks, transformers). Throughout, we illustrate with examples drawn from multiple domains (e.g. astronomy, weather forecasting, finance) to emphasize common principles. The goal is to equip readers with both the theoretical understanding and practical context to apply machine learning techniques for time series analysis in their research.

artificial intelligence, machine learning, time sery, (12 more...)

arXiv.org Machine Learning

2606.11746

Country: North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.26)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine (1.00)
Banking & Finance > Trading (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Time Series Analysis (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

CRUMB: Efficient Prior Fitted Network Inference via Distributionally Matched Context Batching

Heredge, Jamie, Villani, Mattia J., Deshpande, Pranav, Seshadri, Akshay, Kumar, Niraj

arXiv.org Machine Learning | Jun-11-2026

Prior-fitted networks (PFNs) are a promising class of tabular foundation models that perform in-context learning, whereby the entire labelled training set is supplied as context, and predictions for test queries are produced in a single forward pass. However, the quadratically scaling self-attention mechanism in many PFN architectures makes inference prohibitive for very large training datasets. We propose CRUMB (Clustered Retrieval Using Minimised-MMD Batching), a three-stage inference wrapper that (i) clusters the test queries, (ii) selects a small, distributionally matched training subset for each cluster by greedily minimising the maximum mean discrepancy (MMD), and (iii) runs exact PFN inference on each reduced-context batch. CRUMB is architecture-agnostic and requires no retraining. On the 51-dataset TabArena benchmark, evaluated across three PFN architectures (TabPFNv2, TabICLv1, TabICLv2), we show that CRUMB outperforms similar state-of-the-art context selection strategies. We also show that CRUMB is resilient to covariate drift, as the MMD-minimisation step naturally helps align the training context distribution to match the current test batch distributions.
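Stage (ii) of the abstract, greedy selection of a training subset that minimises the MMD to a cluster of test queries, can be illustrated with a minimal sketch. This is not the authors' implementation: the RBF kernel, the `gamma` bandwidth, the function names, and the one-point-at-a-time greedy rule are all assumptions made for illustration, and the clustering and PFN inference stages are omitted.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel matrix between rows of a and rows of b (kernel choice is an assumption)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def greedy_mmd_subset(train_x, query_x, k, gamma=1.0):
    """Greedily pick up to k training indices whose empirical distribution has
    small squared MMD to the query set. Hypothetical helper, not CRUMB's code."""
    train_x = np.asarray(train_x, dtype=float)
    query_x = np.asarray(query_x, dtype=float)
    k_tt = rbf_kernel(train_x, train_x, gamma)
    # Mean kernel value between each training point and the query set.
    k_tq_mean = rbf_kernel(train_x, query_x, gamma).mean(axis=1)
    diag = np.diag(k_tt)
    n_train = len(train_x)

    selected = []
    available = np.ones(n_train, dtype=bool)
    cross = np.zeros(n_train)  # sum of K(c, i) over already selected i
    s_ss = 0.0                 # sum of K over all pairs inside the selection
    s_sq = 0.0                 # sum over selected i of mean_j K(i, q_j)

    for step in range(min(k, n_train)):
        n = step + 1
        # Squared MMD after adding each candidate, up to the constant
        # query-query term, which does not affect the argmin.
        new_ss = s_ss + 2.0 * cross + diag
        new_sq = s_sq + k_tq_mean
        score = new_ss / n**2 - 2.0 * new_sq / n
        score[~available] = np.inf
        c = int(np.argmin(score))
        selected.append(c)
        available[c] = False
        s_ss, s_sq = new_ss[c], new_sq[c]
        cross += k_tt[:, c]
    return selected

# Toy usage: queries sit near 10, so the points near 10 should be chosen.
train = [[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]]
queries = [[10.05], [10.15]]
chosen = greedy_mmd_subset(train, queries, k=3)
```

The running sums keep each greedy step linear in the number of candidates once the kernel matrix is built; for very large pools the full kernel matrix itself would need to be avoided, which this sketch does not attempt.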

artificial intelligence, machine learning, test point, (18 more...)

arXiv.org Machine Learning

2606.11473

Genre: Research Report > Experimental Study (1.00)

Industry:

Banking & Finance (0.46)
Education (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Magnitude-Based Features for Multispecies Spatial Data

Sollberger, Julia, Bull, Joshua, Kališnik, Sara, Stolz, Bernadette

arXiv.org Machine Learning | Jun-11-2026

Multispecies spatial data arise in many applications where interactions between different entities are central to system behaviour, including biomedical imaging, geospatial analysis, and species ecology. Despite their importance, relatively few quantitative tools exist to capture such interactions. In this work, we propose magnitude-based features for the analysis of multispecies spatial data. Magnitude is a real-valued invariant of finite metric spaces that can be interpreted as an effective number of points, incorporating both spatial configuration and scale. We develop global and local magnitude feature vectors and demonstrate their utility on synthetic tumour microenvironment data and on tissue microarray data from human colorectal cancer samples. Locally, the method identifies distinct neighbourhood types and reveals spatial heterogeneity; in the model, this includes radial patterns associated with different qualitative outcomes of the simulations, while in the real-world data it reflects the importance of tertiary lymphoid structure-like interactions between B and T cell populations. Globally, the approach recovers known classifications of long-term simulation outcomes across parameter regimes in synthetic data, and suggests important roles for CD4+ T cells and CD163+ macrophages in distinguishing patients with favourable Crohn's-like reactions from unfavourable diffuse immune infiltration. Together, these results suggest that magnitude-based features provide a powerful and flexible tool for the analysis of multispecies spatial data.
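The "effective number of points" reading of magnitude can be made concrete with a small sketch of the standard definition for a finite point set: form the similarity matrix with entries exp(-t * d(x_i, x_j)), solve for a weighting whose image under that matrix is the all-ones vector, and sum the weights. The function name and the scale parameter `t` are illustrative choices; the paper's global and local feature vectors are not reproduced here.

```python
import numpy as np

def magnitude(points, t=1.0):
    """Magnitude of a finite set of distinct Euclidean points at scale t.

    Illustrative helper (name and signature are assumptions): solves
    Z w = 1 with Z[i, j] = exp(-t * d(x_i, x_j)) and returns sum(w).
    """
    pts = np.asarray(points, dtype=float)
    diffs = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(-1))
    similarity = np.exp(-t * dist)
    # Duplicate points make the matrix singular, so inputs must be distinct.
    weights = np.linalg.solve(similarity, np.ones(len(pts)))
    return float(weights.sum())

# Two points at distance 1: magnitude grows from about 1 to about 2 as t increases.
pair = [[0.0, 0.0], [1.0, 0.0]]
small_scale = magnitude(pair, t=0.01)
large_scale = magnitude(pair, t=100.0)
```

For two points at distance d the result has the closed form 2 / (1 + exp(-t * d)), so at small scales the pair looks like roughly one point and at large scales like two, which matches the interpretation given in the abstract.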

artificial intelligence, machine learning, spatial reasoning, (18 more...)

arXiv.org Machine Learning

2606.11775

Country: Europe (0.46)

Genre: Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.57)

Conformal Risk-Averse Decision Making with Action Conditional Guarantee

Zhu, Zihan, Kiyani, Shayan, Pappas, George, Hassani, Hamed

arXiv.org Machine Learning | Jun-11-2026

Reliable decision-making pipelines powered by machine learning models require uncertainty quantification (UQ) methods that come with explicit safety guarantees. Conformal prediction provides such UQ by wrapping ML predictions into prediction sets, and recent work by Kiyani et al. (2025b) established that these sets can be translated into optimal risk-averse decision policies, although those policies inherit only marginal safety guarantees. We generalize and strengthen their results by (i) introducing action-conditional conformal prediction, which yields safety guarantees conditioned explicitly on each action taken by the decision maker, (ii) showing that action-conditional prediction sets serve as a proxy for the feasible decision space for risk-averse decision makers aiming to optimize action-conditional value-at-risk, and (iii) proposing a principled finite-sample algorithm based on pinball-loss minimization, connecting the framework of Gibbs et al. (2025) to action-conditional guarantees. Experiments on two real-world datasets confirm that our approach significantly improves action-conditional performance over conformal baselines.
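As background for item (iii), the pinball loss is the standard quantile-regression objective: minimising it over a constant recovers an empirical quantile, which is why it appears in conformal calibration. The sketch below shows only that basic fact on toy data; the function name, the grid search, and the data are assumptions, and the paper's action-conditional algorithm is not reproduced.

```python
import numpy as np

def pinball_loss(y, q, tau):
    """Mean pinball (quantile) loss of a constant prediction q at level tau."""
    u = np.asarray(y, dtype=float) - q
    # Under-prediction (u > 0) costs tau * u; over-prediction costs (1 - tau) * |u|.
    return float(np.mean(np.maximum(tau * u, (tau - 1.0) * u)))

# Toy scores 1..100: the minimiser at tau = 0.9 lies at the empirical 90% quantile.
scores = np.arange(1, 101, dtype=float)
grid = np.linspace(0.0, 101.0, 1011)
losses = [pinball_loss(scores, q, tau=0.9) for q in grid]
best_q = float(grid[int(np.argmin(losses))])
```

On this toy data the loss is flat between 90 and 91, so any value in that interval is a minimiser and `best_q` lands somewhere inside it.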

data mining, decision support system, machine learning, (18 more...)

arXiv.org Machine Learning

2606.05551

Country: North America > United States > California (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Decision Support Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)
