Instructional Material
Particle Dynamics for Latent-Variable Energy-Based Models
Tang, Shiqin, Zhuang, Shuxin, Feng, Rong, Yu, Runsheng, Li, Hongzong, Zhang, Youzhi
Latent-variable energy-based models (LV-EBMs) assign a single normalized energy to joint pairs of observed data and latent variables, offering expressive generative modeling while capturing hidden structure. We recast maximum-likelihood training as a saddle problem over distributions on the latent and joint manifolds and view the inner updates as coupled Wasserstein gradient flows. The resulting algorithm alternates overdamped Langevin updates for a joint negative pool and for conditional latent particles with stochastic parameter ascent, requiring no discriminator or auxiliary networks. We prove existence and convergence under standard smoothness and dissipativity assumptions, with decay rates in KL divergence and Wasserstein-2 distance. The saddle-point view further yields an ELBO strictly tighter than bounds obtained with restricted amortized posteriors. Our method is evaluated on numerical approximations of physical systems and performs competitively against comparable approaches.
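The overdamped Langevin update mentioned in the abstract can be illustrated on a toy problem. The sketch below is not the authors' algorithm: it only shows the generic particle update x <- x - h * dE/dx + sqrt(2h) * noise, applied to a hypothetical one-dimensional energy E(x) = 0.5 * (x - 1)^2 whose Gibbs distribution is N(1, 1). The function names, step size, and particle count are all assumptions chosen for illustration.

```python
import math
import random


def langevin_step(x, grad_energy, step_size, rng):
    """One overdamped Langevin update: x - h * dE/dx + sqrt(2h) * standard normal noise."""
    noise = rng.gauss(0.0, 1.0)
    return x - step_size * grad_energy(x) + math.sqrt(2.0 * step_size) * noise


def run_particles(n_particles=1000, n_steps=500, step_size=0.01, seed=0):
    """Evolve a pool of particles under a toy energy (assumption, not from the paper)."""
    rng = random.Random(seed)

    # Hypothetical toy energy E(x) = 0.5 * (x - 1)^2, so dE/dx = x - 1.
    def grad_energy(x):
        return x - 1.0

    # Start far from equilibrium: uniform on [-5, 5].
    particles = [rng.uniform(-5.0, 5.0) for _ in range(n_particles)]
    for _ in range(n_steps):
        particles = [langevin_step(x, grad_energy, step_size, rng) for x in particles]
    return particles


def mean_and_variance(values):
    """Empirical mean and (biased) variance of a list of floats."""
    m = sum(values) / len(values)
    v = sum((x - m) ** 2 for x in values) / len(values)
    return m, v


if __name__ == "__main__":
    m, v = mean_and_variance(run_particles())
    print("empirical mean:", m, "empirical variance:", v)
```

In this toy setting the particle pool drifts toward the low-energy region and its empirical mean and variance approach those of N(1, 1), up to discretization and sampling error. In the method described above, updates of this kind would be applied to both the joint negative pool and the conditional latent particles, interleaved with parameter updates.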
Nonlinear Dimensionality Reduction Techniques for Bayesian Optimization
Long, Luo, Cartis, Coralia, Shustin, Paz Fink
Bayesian optimisation (BO) is a standard approach for sample-efficient global optimisation of expensive black-box functions, yet its scalability to high dimensions remains challenging. Here, we investigate nonlinear dimensionality reduction techniques that reduce the problem to a sequence of low-dimensional Latent-Space BO (LSBO). While early LSBO methods used (linear) random projections (Wang et al., 2013), building on Grosnit et al. (2021), we employ Variational Autoencoders (VAEs) for LSBO, focusing on deep metric loss for structured latent manifolds and VAE retraining to adapt the encoder-decoder to newly sampled regions. We propose some changes in their implementation, originally designed for tasks such as molecule generation, and reformulate the algorithm for broader optimisation purposes. We then couple LSBO with Sequential Domain Reduction (SDR) directly in the latent space (SDR-LSBO), yielding an algorithm that narrows the latent search domains as evidence accumulates. Implemented in a GPU-accelerated BoTorch stack with Matern-5/2 Gaussian process surrogates, our numerical results show improved optimisation quality across benchmark tasks and that structured latent manifolds improve BO performance. Additionally, we compare random embeddings and VAEs as two mechanisms for dimensionality reduction, showing that the latter outperforms the former. To the best of our knowledge, this is the first study to combine SDR with VAE-based LSBO, and our analysis clarifies design choices for metric shaping and retraining that are critical for scalable latent space BO. For reproducibility, our source code is available at https://github.com/L-Lok/Nonlinear-Dimensionality-Reduction-Techniques-for-Bayesian-Optimization.git.
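The idea of narrowing the latent search domain as evidence accumulates can be sketched with a simple box update. This is a deliberately simplified stand-in, not the Sequential Domain Reduction rule used in the paper: it recentres each latent interval on the current best point and scales its width by a fixed factor, clipped to the original bounds. The function name `shrink_domain` and the `shrink` parameter are hypothetical.

```python
def shrink_domain(bounds, center, shrink=0.7, outer=None):
    """Recentre each interval on `center` and scale its width by `shrink`.

    bounds: list of (low, high) pairs, one per latent dimension.
    center: current best latent point (same length as bounds).
    outer:  optional hard limits; defaults to the incoming bounds.
    Simplified illustration only (assumption), not the full SDR update.
    """
    outer = bounds if outer is None else outer
    new_bounds = []
    for (low, high), c, (outer_low, outer_high) in zip(bounds, center, outer):
        half_width = 0.5 * shrink * (high - low)
        # Clip so the reduced box never leaves the allowed region.
        new_low = max(outer_low, c - half_width)
        new_high = min(outer_high, c + half_width)
        new_bounds.append((new_low, new_high))
    return new_bounds


if __name__ == "__main__":
    box = [(-1.0, 1.0), (-1.0, 1.0)]
    best = (0.0, 0.9)
    for iteration in range(3):
        box = shrink_domain(box, best, shrink=0.5, outer=[(-1.0, 1.0), (-1.0, 1.0)])
        print(iteration, box)
```

In a latent-space BO loop, an update of this kind would be applied after each batch of evaluations, so that the acquisition function is optimised over a progressively smaller region around the incumbent.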
4 ways to fix 'tech neck,' according to a physical therapist
Strengthening can help if you're staring at your phone too much. You don't need a ton of equipment to fix your neck. If you're here seeking relief from tech neck, or the forward head posture associated with the use of personal devices, we've got good and bad news. The good news is you've come to the right place; the bad news is you're probably contributing to it right now.
Inside San Francisco's new AI school: is this the future of US education?
Experts have raised questions about whether an app-based curriculum can serve all learners equally. In the world's tech innovation epicenter, an "AI-powered" private school has made headlines for unabashedly embracing the technology. Alpha School San Francisco, which opened its doors to K-8 students this fall, is the newest outpost of a network of 14 nationwide private schools.
One-Step Flow Policy Mirror Descent
Chen, Tianyi, Ma, Haitong, Li, Na, Wang, Kai, Dai, Bo
Diffusion policies have achieved great success in online reinforcement learning (RL) due to their strong expressive capacity. However, the inference of diffusion policy models relies on a slow iterative sampling process, which limits their responsiveness. To overcome this limitation, we propose Flow Policy Mirror Descent (FPMD), an online RL algorithm that enables 1-step sampling during flow policy inference. Our approach exploits a theoretical connection between the distribution variance and the discretization error of single-step sampling in straight interpolation flow matching models, and requires no extra distillation or consistency training. We present two algorithm variants based on rectified flow policy and MeanFlow policy, respectively. Extensive empirical evaluations on MuJoCo and visual DeepMind Control Suite benchmarks demonstrate that our algorithms show strong performance comparable to diffusion policy baselines while requiring orders of magnitude less computational cost during inference. Diffusion models have established themselves as the state-of-the-art paradigm in generative modeling (Ho et al., 2020; Dhariwal & Nichol, 2021), capable of synthesizing data of unparalleled quality and diversity across various modalities, including images, audio, and video. The success is rooted in a principled, thermodynamically-inspired framework that learns to reverse a gradual noising process (Sohl-Dickstein et al., 2015).
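The link between target variance and one-step sampling error can be seen in a toy Gaussian case. The sketch below is an illustration under stated assumptions, not the paper's analysis: the source is N(0, 1), the target is N(mu, sigma^2), the two are drawn independently, and the interpolation is x_t = (1 - t) * x0 + t * x1. Under these assumptions the marginal velocity field has a closed form, and a single Euler step from t = 0 sends every x0 to mu. The one-step sample is therefore exact when sigma = 0 and loses exactly the target's spread otherwise. The function names are hypothetical.

```python
def velocity(x, t, mu, sigma):
    """Marginal velocity E[x1 - x0 | x_t = x] for the toy Gaussian setup (assumption)."""
    var_t = (1.0 - t) ** 2 + (t * sigma) ** 2      # Var(x_t)
    cov = t * sigma ** 2 - (1.0 - t)               # Cov(x1 - x0, x_t)
    return mu + cov / var_t * (x - t * mu)


def sample(x0, n_steps, mu, sigma):
    """Integrate dx/dt = velocity(x, t) from t = 0 to t = 1 with explicit Euler steps."""
    x = x0
    h = 1.0 / n_steps
    for k in range(n_steps):
        x = x + h * velocity(x, k * h, mu, sigma)
    return x


if __name__ == "__main__":
    mu, sigma = 2.0, 0.5
    for x0 in (-1.0, 0.0, 1.0):
        print(x0, sample(x0, 1, mu, sigma), sample(x0, 100, mu, sigma))
```

With one step, all three starting points land on mu, while with many steps the outputs spread out around mu roughly in proportion to sigma. The smaller the target variance, the less a single step gives up.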
Absolute Zero: Reinforced Self-play Reasoning with Zero Data
Zhao, Andrew, Wu, Yiran, Yue, Yang, Wu, Tong, Xu, Quentin, Yue, Yang, Lin, Matthieu, Wang, Shenzhi, Wu, Qingyun, Zheng, Zilong, Huang, Gao
Reinforcement learning with verifiable rewards (RLVR) has shown promise in enhancing the reasoning capabilities of large language models by learning directly from outcome-based rewards. Recent RLVR works that operate under the zero setting avoid supervision in labeling the reasoning process, but still depend on manually curated collections of questions and answers for training. The scarcity of high-quality, human-produced examples raises concerns about the long-term scalability of relying on human supervision, a challenge already evident in the domain of language model pretraining. Furthermore, in a hypothetical future where AI surpasses human intelligence, tasks provided by humans may offer limited learning potential for a superintelligent system. To address these concerns, we propose a new RLVR paradigm called Absolute Zero, in which a single model learns to propose tasks that maximize its own learning progress and improves reasoning by solving them, without relying on any external data. Under this paradigm, we introduce the Absolute Zero Reasoner (AZR), a system that self-evolves its training curriculum and reasoning ability by using a code executor to both validate proposed code reasoning tasks and verify answers, serving as a unified source of verifiable reward to guide open-ended yet grounded learning. Despite being trained entirely without external data, AZR achieves overall SOTA performance on coding and mathematical reasoning tasks, outperforming existing zero-setting models that rely on tens of thousands of in-domain human-curated examples. Furthermore, we demonstrate that AZR can be effectively applied across different model scales and is compatible with various model classes.
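The role of the code executor as both task validator and answer verifier can be sketched in a few lines. This is a toy illustration, not the AZR implementation: it assumes a proposed task is a Python source string defining a function named `f` plus one input, treats a task as valid if the program runs without raising, and gives reward 1.0 when the solver's predicted output equals the executed output. All names here are hypothetical, and the bare `exec` call is unsandboxed, so it is unsuitable for untrusted code.

```python
def run_program(source, arg):
    """Execute `source` and call the function `f` it is assumed to define."""
    namespace = {}
    exec(source, namespace)  # toy executor (assumption): no sandboxing or timeouts
    return namespace["f"](arg)


def validate_task(source, arg):
    """Return (is_valid, gold_output); a task is valid if it runs without error."""
    try:
        return True, run_program(source, arg)
    except Exception:
        return False, None


def verify_answer(source, arg, predicted):
    """Binary verifiable reward: 1.0 if the prediction matches the executed output."""
    is_valid, gold = validate_task(source, arg)
    if not is_valid:
        return 0.0
    return 1.0 if predicted == gold else 0.0


if __name__ == "__main__":
    task = "def f(x):\n    return sorted(x)[::-1]\n"
    print(validate_task(task, [3, 1, 2]))
    print(verify_answer(task, [3, 1, 2], [3, 2, 1]))
    print(verify_answer("def f(x):\n    return 1 / 0\n", 5, 0))
```

Because the executor supplies the ground truth, the same mechanism that filters out broken proposals also scores the solver, so no human-written answers are needed in this loop.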
Restoring Noisy Demonstration for Imitation Learning With Diffusion Models
Chen, Shang-Fu, Yong, Co, Sun, Shao-Hua
Imitation learning (IL) aims to learn a policy from expert demonstrations and has been applied to various applications. By learning from the expert policy, IL methods do not require environmental interactions or reward signals. However, most existing imitation learning algorithms assume perfect expert demonstrations, but expert demonstrations often contain imperfections caused by errors from human experts or sensor/control system inaccuracies. To address the above problems, this work proposes a filter-and-restore framework to best leverage expert demonstrations with inherent noise. Our proposed method first filters clean samples from the demonstrations and then learns conditional diffusion models to recover the noisy ones. We evaluate our proposed framework and existing methods in various domains, including robot arm manipulation, dexterous manipulation, and locomotion. The experiment results show that our proposed framework consistently outperforms existing methods across all the tasks. Ablation studies further validate the effectiveness of each component and demonstrate the framework's robustness to different noise types and levels. These results confirm the practical applicability of our framework to noisy offline demonstration data. Imitation learning [1]-[13] aims to learn a policy from expert demonstrations and has been applied to various applications, including robotics [8], industrial automation, strategy board games, video games, etc. [14]-[19]. Compared to reinforcement learning (RL), which acquires a policy in a trial-and-error manner that can be unsafe or expensive, imitation learning (IL) algorithms can learn without environmental interactions. Furthermore, while designing sophisticated RL reward functions is often difficult and tedious [20], [21], IL methods learn from expert demonstrations and do not require reward signals.
Despite the wide applicability, most existing imitation learning algorithms assume perfect (i.e., optimal and clean) expert demonstrations, which can be challenging and expensive to collect. Specifically, expert demonstrations often contain imperfections caused by errors from human experts or sensor and control system inaccuracies.
Towards Neurocognitive-Inspired Intelligence: From AI's Structural Mimicry to Human-Like Functional Cognition
Golilarz, Noorbakhsh Amiri, Khatib, Hassan S. Al, Rahimi, Shahram
Artificial intelligence has advanced significantly through deep learning, reinforcement learning, and large language and vision models. However, these systems often remain task specific, struggle to adapt to changing conditions, and cannot generalize in ways similar to human cognition. Additionally, they mainly focus on mimicking brain structures, which often leads to black-box models with limited transparency and adaptability. Inspired by the structure and function of biological cognition, this paper introduces the concept of "Neurocognitive-Inspired Intelligence (NII)," a hybrid approach that combines neuroscience, cognitive science, computer vision, and AI to develop more general, adaptive, and robust intelligent systems capable of rapid learning, learning from less data, and leveraging prior experience. These systems aim to emulate the human brain's ability to flexibly learn, reason, remember, perceive, and act in real-world settings with minimal supervision. We review the limitations of current AI methods, define core principles of neurocognitive-inspired intelligence, and propose a modular, biologically inspired architecture that emphasizes integration, embodiment, and adaptability. We also discuss potential implementation strategies and outline various real-world applications, from robotics to education and healthcare. Importantly, this paper offers a hybrid roadmap for future research, laying the groundwork for building AI systems that more closely resemble human cognition.
AI-Agents for Culturally Diverse Online Higher Education Environments
Sun, Fuze, Craig, Paul, Li, Lingyu, Meng, Shixiangyue, Nan, Chuxi
As the global reach of online higher education continues to grow, universities are increasingly accommodating students from diverse cultural backgrounds (Tereshko et al., 2024). This can present a number of challenges, including linguistic barriers (Ullah et al., 2021), cultural differences in learning style (Omidvar & Tan, 2012), cultural sensitivity in course design (Nguyen, 2022), and perceived isolation when students feel their perspectives or experiences are not reflected or valued in the learning environment (Hansen-Brown et al., 2022). Ensuring active engagement and reasonable learning outcomes in such environments requires distance educational systems that are not only adaptive but also culturally resonant (Dalle et al., 2024). Both embodied and virtual AI-Agents have great potential in this regard, as they can facilitate personalized learning and adapt their interactions and content delivery to align with students' cultural context. In addition, Generative AI (GAI), such as Large Language Models (LLMs), can amplify the potential for these culturally aware AI agents to address educational challenges due to its advanced capacity for understanding and generating contextually relevant content (Wang et al., 2024). This chapter reviews existing research and suggests the usage of culturally aware AI-Agents, powered by GAI, to foster engagement and improve learning outcomes in culturally diverse online higher education environments.
Socratic Mind: Impact of a Novel GenAI-Powered Assessment Tool on Student Learning and Higher-Order Thinking
Lee, Jeonghyun, Hung, Jui-Tse, Soylu, Meryem Yilmaz, Popescu, Diana, Cui, Christopher Zhang, Grigoryan, Gayane, Joyner, David A, Harmon, Stephen W
This study examines the impact of Socratic Mind, a Generative Artificial Intelligence (GenAI) powered formative assessment tool that employs Socratic questioning to support student learning in a large, fully online undergraduate-level computing course. Employing a quasi-experimental, mixed-methods design, we investigated participants' engagement patterns, the influence of user experience on engagement, and impacts on both perceived and actual learning outcomes. Data were collected from the system logs, surveys on user experience and perceived engagement and learning gains, student reflections, and course performance data. Results indicated that participants consistently reported high levels of affective, behavioral, and cognitive engagement, and these were strongly linked to positive user experiences and perceived learning outcomes. Quantitative analysis further revealed that students who engaged with the GenAI tool experienced significant gains in their quiz scores compared to those who did not, with students of lower baseline achievement benefiting the most. Additionally, thematic analysis of qualitative feedback revealed substantial perceived improvements in higher-order thinking skills, including problem solving, critical thinking, and self-reflection. Our findings highlight the promise of AI-mediated dialogue in fostering deeper engagement and higher-order cognitive skills. As higher education institutions expand GenAI integration into their curricula, this dialogic, GenAI-powered assessment tool can offer a scalable strategy to promote students' meaningful learning outcomes.