AITopics | Industry

Collaborating Authors

Industry

Learning Across the Gap: Hybrid Multi-armed Bandits with Heterogeneous Offline and Online Data

Neural Information Processing SystemsJun-15-2026, 11:10:03 GMT

The multi-armed bandit (MAB) is a fundamental online decision-making framework that has been extensively studied over the past two decades. To mitigate the high cost and slow convergence of purely online learning, modern MAB approaches have explored hybrid paradigms that leverage offline data to warm-start online learning. However, existing approaches face a significant limitation by assuming that the offline and online data are homogeneous--they share the same feedback structure and are drawn from the same underlying distribution. This assumption is often violated in practice, where offline data often originate from diverse sources and evolving environments, resulting in feedback heterogeneity and distributional shifts. In this work, we tackle the challenge of learning across this offline-online gap by developing a general hybrid bandit framework that incorporates heterogeneous offline data to improve online performance. We study two hybrid settings: (1) using reward-based offline data to accelerate online learning in preference-based bandits (i.e., dueling bandits), and (2) using preference-based offline data to improve online standard MAB algorithms. For both settings, we design novel algorithms and derive tight regret bounds that match or improve upon existing benchmarks despite heterogeneity. Empirical evaluations on both synthetic and real-world datasets further show the superior performance of our proposed methods over baseline algorithms.

data mining, machine learning, reinforcement learning, (21 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Overview (0.87)

Industry: Education > Educational Setting > Online (0.74)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)

Add feedback

MathArena: Evaluating LLMs on Uncontaminated Math Competitions

Neural Information Processing SystemsJun-15-2026, 11:08:54 GMT

The rapid advancement of reasoning capabilities in large language models (LLMs) has led to notable improvements on mathematical benchmarks. However, many of the most commonly used evaluation datasets (e.g., AIME 2024) are widely available online, making it difficult to disentangle genuine reasoning from potential memorization. Furthermore, these benchmarks do not evaluate proof-writing capabilities, which are crucial for many mathematical tasks. To address this, we introduce MATHARENA, a new benchmark based on the following key insight: recurring math competitions provide a stream of high-quality, challenging problems that can be used for real-time evaluation of LLMs. By evaluating models as soon as new problems are released, we effectively eliminate the risk of contamination.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.99)

Add feedback

BikeBench: ABicycle Design Benchmark for Generative Models with Objectives and Constraints

Neural Information Processing SystemsJun-15-2026, 11:08:37 GMT

We introduce BikeBench, an engineering design benchmark for evaluating generative models on problems with multiple real-world objectives and constraints. As generative AI's reach continues to grow, evaluating its capability to understand physical laws, human guidelines, and hard constraints grows increasingly important. Engineering product design lies at the intersection of these difficult tasks, providing new challenges for AI capabilities. BikeBench evaluates AI models' capabilities to generate bicycle designs that not only resemble the dataset, but meet specific performance objectives and constraints. To do so, BikeBench quantifies a variety of human-centered and multiphysics performance characteristics, such as aerodynamics, ergonomics, structural mechanics, human-rated usability, and similarity to subjective text or image prompts. Supporting the benchmark are several datasets of simulation results, a dataset of 10,000 human-rated bicycle assessments, and a synthetically generated dataset of 1.6M designs, each with a parametric, CAD/XML, SVG, and PNG representation. BikeBench is uniquely configured to evaluate tabular generative models, large language models (LLMs), design optimization, and hybrid algorithms side-by-side. Our experiments indicate that LLMs and tabular generative models fall short of hybrid GenAI+optimization algorithms in design quality, constraint satisfaction, and similarity scores, suggesting significant room for improvement. We hope that BikeBench, a first-of-its-kind benchmark, will help catalyze progress in generative AI for constrained multi-objective engineering design problems.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (0.67)

Industry:

Automobiles & Trucks (0.69)
Health & Medicine (0.46)
Leisure & Entertainment > Sports > Cycling (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.69)

Add feedback

Learned

Neural Information Processing SystemsJun-15-2026, 11:07:37 GMT

The quality of foundation models depends heavily on their training data. Consequently, great efforts have been put into dataset curation. Yet most approaches rely on manual tuning of coarse-grained mixtures of large buckets of data, or filtering by hand-crafted heuristics. An approach that is ultimately more scalable (let alone more satisfying) is to learn which data is actually valuable for training. This type of meta-learning could allow more sophisticated, fine-grained, and effective curation. Our proposed DataRater is an instance of this idea. It estimates the value of training on any particular data point. This is done by meta-learning using'meta-gradients', with the objective of improving training efficiency on held out data. In extensive experiments across a range of model scales and datasets, we find that using our DataRater to filter data is highly effective, resulting in significantly improved compute efficiency.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country:

Europe (1.00)
Asia (0.92)
North America > United States > New York (0.28)
North America > United States > Minnesota (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Multidimensional Bayesian Utility Maximization: Tight Approximations to Welfare

Neural Information Processing SystemsJun-15-2026, 11:07:18 GMT

We initiate the study of multidimensional Bayesian utility maximization, focusing on the unit-demand setting where values are i.i.d.

artificial intelligence, machine learning, mechanism, (20 more...)

Neural Information Processing Systems

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (0.46)
Banking & Finance (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Game Theory (0.68)

Add feedback

The Resurgence of Bracero Logic

TIME - TechJun-15-2026, 11:00:17 GMT

Follow this section to personalize your feed and get instant alerts. Follow Go to your personalized feed WHY FOLLOW? Smart Alerts: Get notified about major news as it happens. Follow this tag to personalize your feed and get instant alerts. Follow Go to your personalized feed WHY FOLLOW?

advertisement, artificial intelligence, bracero program, (14 more...)

TIME - Tech

Country: North America > United States (1.00)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Immigration & Customs (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.42)

Add feedback

Windows 11's June update makes PCs feel faster. Here's what's new

PCWorldJun-15-2026, 11:00:00 GMT

PCWorld reports that Windows 11's June 2026 update (KB5094126) introduces performance improvements including a Low Latency Profile that boosts CPU speeds for faster app launches. Key new features include simultaneous webcam access for multiple apps, Shared Audio for connecting two Bluetooth headsets, and enhanced Windows Search requiring only two characters. The update also adds NPU usage monitoring in Task Manager for compatible PCs, making the overall system feel more responsive. On Patch Tuesday in June 2026, Microsoft not only released the long-awaited Low Latency Profile for Windows 11, but also introduced Shared Audio and several other improvements as part of update KB5094126.

artificial intelligence, digital magazine, home robotic performance privacy productivity, (10 more...)

PCWorld

Industry:

Information Technology > Security & Privacy (1.00)
Leisure & Entertainment > Games > Computer Games (0.55)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Robots (0.55)

Add feedback

Doctor Approved: Generating Medically Accurate Skin Disease Images through AI-Expert Feedback

Neural Information Processing SystemsJun-15-2026, 10:57:03 GMT

Paucity of medical data severely limits the generalizability of diagnostic ML models, as the full spectrum of disease variability can not be represented by a small clinical dataset. To address this, diffusion models (DMs) have been considered as a promising avenue for synthetic image generation and augmentation. However, they frequently produce medically inaccurate images, deteriorating the model performance. Expert domain knowledge is critical for synthesizing images that correctly encode clinical information, especially when data is scarce and quality outweighs quantity. Existing approaches for incorporating human feedback, such as reinforcement learning (RL) and Direct Preference Optimization (DPO), rely on robust reward functions or demand labor-intensive expert evaluations. Recent progress in Multimodal Large Language Models (MLLMs) reveals their strong visual reasoning capabilities, making them adept candidates as evaluators. In this work, we propose a novel framework, coined MAGIC (Medically Accurate Generation of Images through AI-Expert Collaboration), that synthesizes clinically accurate skin disease images for data augmentation.

large language model, machine learning, natural language, (21 more...)

Neural Information Processing Systems

Country: North America > United States (0.67)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Dermatology (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.46)

Add feedback

Benchmarking End-To-End Performance of AI-Based Chip Placement Algorithms

Neural Information Processing SystemsJun-15-2026, 10:56:09 GMT

Chip placement is a critical step in the Electronic Design Automation (EDA) workflow, which aims to arrange chip modules on the canvas to optimize the performance, power, and area (PPA) metrics of final designs. Recent advances show great potential of AI-based algorithms in chip placement. However, due to the lengthy EDA workflow, evaluations of these algorithms often focus on intermediate surrogate metrics, which are computationally efficient but often misalign with the final end-to-end performance (i.e., the final design PPA). To address this challenge, we propose to build ChiPBench, a comprehensive benchmark specifically designed to evaluate the effectiveness of AI-based algorithms in final design PPA metrics. Specifically, we generate a diverse evaluation dataset from 20circuits across various domains, such as CPUs, GPUs, and NPUs. We then evaluate six state-of-the-art AI-based chip placement algorithms on the dataset and conduct a thorough analysis of their placement behavior. Extensive experiments show that AI-based chip placement algorithms produce unsatisfactory final PPA results, highlighting the significant influence of often-overlooked factors like regularity and dataflow. We believe ChiPBench will effectively bridge the gap between academia and industry.

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Country: Asia (0.46)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Semiconductors & Electronics (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.67)

Add feedback

World Cup racism monitor urges FIFA to remove VAR official over gesture

Al JazeeraJun-15-2026, 10:55:55 GMT

FIFA's discrimination monitor at the World Cup has called for a VAR official to be removed for appearing to make a hand gesture resembling a white supremacist sign. When the official broadcast of Germany's opening game against Curacao on Sunday cut pre-game to show the team of video review analysts, Shaun Evans from Australia made an "OK" symbol with his right hand in front of his right leg. Though the game was played in Houston, video officials work in Dallas at the World Cup broadcast centre. "Advice from our experts is that the gesture used clearly resembles an upside down'OK' hand symbol used as a'white power' symbol in global far-right circles," the Fare network, a long-time partner of FIFA and European football body UEFA to monitor racist and discriminatory chants, flags and symbols at international games, said in a statement. "Clearly this official should have no further role to play in this World Cup," Fare said in a statement, describing the gesture as "neo-Nazi".

artificial intelligence, live navigation menu news show, news section africa asia us, (6 more...)

Al Jazeera

Country:

North America > United States (0.67)
North America > Canada (0.43)
North America > Central America (0.41)

Industry: Leisure & Entertainment > Sports > Soccer (1.00)

Technology: Information Technology > Artificial Intelligence > Vision > Gesture Recognition (0.61)

Add feedback