 School Nutrition


'It's survival of the fittest': the UK kebab chain seeking an edge with robot slicers

The Guardian

'People are being more discerning about spending money,' he says. They are already packing our groceries and delivering shopping. Now robots are coming to the kebab shop, alongside self-service screens and loyalty apps, as takeaways look for ways to tackle rising costs. German Doner Kebab (GDK), a perhaps surprisingly British-owned chain that has been springing up across the country, has turned to technology to keep its fast food business buzzing in the face of rising costs and tough times on the high street.









Provable Offline Reinforcement Learning for Structured Cyclic MDPs

Lee, Kyungbok, Sarteau, Angelica Cristello, Kosorok, Michael R.

arXiv.org Machine Learning

We introduce a novel cyclic Markov decision process (MDP) framework for multi-step decision problems with heterogeneous stage-specific dynamics, transitions, and discount factors across the cycle. In this setting, offline learning is challenging: optimizing a policy at any stage shifts the state distributions of subsequent stages, propagating mismatch across the cycle. To address this, we propose a modular structural framework that decomposes the cyclic process into stage-wise sub-problems. While generally applicable, we instantiate this principle as CycleFQI, an extension of fitted Q-iteration enabling theoretical analysis and interpretation. It uses a vector of stage-specific Q-functions, tailored to each stage, to capture within-stage sequences and transitions between stages. This modular design enables partial control, allowing some stages to be optimized while others follow predefined policies. We establish finite-sample suboptimality error bounds and derive global convergence rates under Besov regularity, demonstrating that CycleFQI mitigates the curse of dimensionality compared to monolithic baselines. Additionally, we propose a sieve-based method for asymptotic inference of optimal policy values under a margin condition. Experiments on simulated and real-world Type 1 Diabetes data sets demonstrate CycleFQI's effectiveness.
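The stage-wise idea in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' CycleFQI implementation: it assumes one decision per stage, small discrete state and action spaces, and a per-cell mean in place of a fitted regressor. The names `cyclic_fqi`, `datasets`, `gammas`, and `fixed_policies` are hypothetical and chosen only for illustration.

```python
import numpy as np


def cyclic_fqi(datasets, n_states, n_actions, gammas, fixed_policies=None, n_iters=200):
    """Tabular sketch of fitted Q-iteration around a cycle of K stages.

    Assumptions (not taken from the paper): datasets[k] is a list of
    (s, a, r, s_next) tuples logged at stage k, and s_next is a state of
    stage (k + 1) % K. fixed_policies maps a stage index to an array giving
    the predefined action for each state of that stage (partial control).
    """
    fixed_policies = fixed_policies or {}
    K = len(datasets)
    # One Q-table per stage: the "vector of stage-specific Q-functions".
    Q = [np.zeros((n_states[k], n_actions[k])) for k in range(K)]
    for _ in range(n_iters):
        new_Q = []
        for k in range(K):
            nxt = (k + 1) % K
            if nxt in fixed_policies:
                # Next stage follows a predefined policy: evaluate it.
                pi = np.asarray(fixed_policies[nxt])
                v_next = Q[nxt][np.arange(len(pi)), pi]
            else:
                # Next stage is optimized: act greedily on its Q-table.
                v_next = Q[nxt].max(axis=1)
            sums = np.zeros_like(Q[k])
            counts = np.zeros_like(Q[k])
            for s, a, r, s_next in datasets[k]:
                # Regression target uses the stage-specific discount factor.
                sums[s, a] += r + gammas[k] * v_next[s_next]
                counts[s, a] += 1
            # In the tabular case the least-squares fit is the per-cell mean;
            # cells with no data stay at zero.
            fitted = np.divide(sums, counts, out=np.zeros_like(sums), where=counts > 0)
            new_Q.append(fitted)
        Q = new_Q
    return Q
```

Passing `fixed_policies={k: actions}` holds stage `k` to a predefined policy while the remaining stages are optimized, which mirrors the partial-control setting the abstract describes. A function approximator could replace the per-cell mean without changing the loop structure.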


A Data Analysis The LoRA Dataset Project page: https://lora-vqa.github.io/

Neural Information Processing Systems

Each question and answer group has a unique list of corresponding visuals used for image creation. The list of visible objects, which combines the correct-answer objects with an arbitrary 'noise' object