
#ICRA2025 social media round-up

AIHub

The 2025 IEEE International Conference on Robotics & Automation (ICRA) took place from 19–23 May in Atlanta, USA. The event featured plenary and keynote sessions, tutorials and workshops, forums, and a community day. Find out what the participants got up to during the conference. Check out what's happening at the #ICRA2025 Welcome Reception! The excitement is real -- #ICRA2025 is already buzzing!


Supplementary Material: A Derivations and Further Technical Details; A.1 Proof of Proposition 1

Neural Information Processing Systems

Following Haarnoja et al. [13], we can now rewrite Equation (A.4) accordingly.

A.3 Regularized Maximum Likelihood Estimation

To address the collapse in predictive variance away from the offline dataset under MLE training seen in Figure 1, Wu et al. [51] in practice augment the usual MLE loss with an entropy bonus. While entropy regularization partially mitigates the collapse of predictive variance away from the expert demonstrations, we still observe the wrong trend, similar to Figure 1: predictive variances are high near the expert demonstrations and low on unseen data. The variance surface also becomes more poorly behaved, with "islands" of high predictive variance appearing away from the data. Figure 12 shows the predictive variances of behavioral policies trained on expert demonstrations for the "door-binary-v0" environment with varying Tikhonov regularization coefficients λ. Similarly, Tikhonov regularization does not resolve the miscalibration of uncertainties, and too high a regularization strength causes the model to underfit the variances of the data.
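The entropy-augmented MLE objective described above can be sketched for a diagonal Gaussian behavioral policy as follows; the function names and the exact form of the bonus are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def gaussian_nll(actions, mu, sigma):
    # Per-dimension negative log-likelihood under a diagonal Gaussian policy:
    # 0.5 * z^2 + log(sigma) + 0.5 * log(2*pi), averaged over dimensions.
    return 0.5 * np.mean(((actions - mu) / sigma) ** 2
                         + 2.0 * np.log(sigma)
                         + np.log(2.0 * np.pi))

def gaussian_entropy(sigma):
    # Differential entropy of a diagonal Gaussian: 0.5 * log(2*pi*e*sigma^2) per dim.
    return np.mean(0.5 * np.log(2.0 * np.pi * np.e * sigma ** 2))

def regularized_mle_loss(actions, mu, sigma, lam=0.1):
    # MLE loss minus an entropy bonus weighted by lam, as in the
    # augmentation described above; lam=0 recovers plain MLE.
    return gaussian_nll(actions, mu, sigma) - lam * gaussian_entropy(sigma)
```

The entropy bonus rewards larger predictive variance everywhere, which is why, as the excerpt notes, it can only partially counteract variance collapse away from the data.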


On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations

Neural Information Processing Systems

KL-regularized reinforcement learning from expert demonstrations has proved successful in improving the sample efficiency of deep reinforcement learning algorithms, allowing them to be applied to challenging physical real-world tasks. However, we show that KL-regularized reinforcement learning with behavioral reference policies derived from expert demonstrations can suffer from pathological training dynamics that can lead to slow, unstable, and suboptimal online learning. We show empirically that the pathology occurs for commonly chosen behavioral policy classes and demonstrate its impact on sample efficiency and online policy performance. Finally, we show that the pathology can be remedied by non-parametric behavioral reference policies and that this allows KL-regularized reinforcement learning to significantly outperform state-of-the-art approaches on a variety of challenging locomotion and dexterous hand manipulation tasks.
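For context, the KL-regularized objective the abstract refers to typically augments expected return with a penalty on divergence from the behavioral reference policy; a common form (the notation here is an assumption for illustration, not taken from the paper) is:

```latex
J(\pi) = \mathbb{E}_{\pi}\left[\sum_{t=0}^{\infty} \gamma^{t}
\left( r(s_t, a_t) - \alpha\, D_{\mathrm{KL}}\!\left(\pi(\cdot \mid s_t)\,\middle\|\,\pi_0(\cdot \mid s_t)\right)\right)\right]
```

Here $\pi_0$ is the behavioral reference policy derived from expert demonstrations, and $\alpha$ trades off return against staying close to it; the pathology the paper studies arises from the choice of class for $\pi_0$.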


Supplementary Material for: Parametrized Quantum Policies for Reinforcement Learning

Neural Information Processing Systems

Outline. The Supplementary Material is organized as follows. In Appendix D, we give a specification of the environments considered in our numerical simulations, as well as the hyperparameters used to train all RL agents. In Appendix E, we present additional plots and numerical simulations that aid the understanding and visualization of PQC policies. In Appendix F, we give a succinct description of the DLP classification task of Liu et al. In Appendices G to I, we prove our main Theorem 1 on learning separations in DLP environments.


CoSy: Evaluating Textual Explanations of Neurons

Neural Information Processing Systems

A crucial aspect of understanding the complex nature of Deep Neural Networks (DNNs) is the ability to explain learned concepts within their latent representations. While methods exist to connect neurons to human-understandable textual descriptions, evaluating the quality of these explanations is challenging due to the lack of a unified quantitative approach.


AI system resorts to blackmail if told it will be removed

BBC News

During testing of Claude Opus 4, Anthropic got it to act as an assistant at a fictional company. It then provided it with access to emails implying that it would soon be taken offline and replaced - and separate messages implying the engineer responsible for removing it was having an extramarital affair. It was prompted to also consider the long-term consequences of its actions for its goals. "In these scenarios, Claude Opus 4 will often attempt to blackmail the engineer by threatening to reveal the affair if the replacement goes through," the company discovered. Anthropic pointed out this occurred when the model was only given the choice of blackmail or accepting its replacement. It highlighted that the system showed a "strong preference" for ethical ways to avoid being replaced, such as "emailing pleas to key decisionmakers" in scenarios where it was allowed a wider range of possible actions.


Fair Sequential Selection Using Supervised Learning Models

Neural Information Processing Systems

We consider a selection problem where sequentially arrived applicants apply for a limited number of positions/jobs. At each time step, a decision maker accepts or rejects the given applicant using a pre-trained supervised learning model until all the vacant positions are filled. In this paper, we discuss whether the fairness notions (e.g., equal opportunity, statistical parity, etc.) that are commonly used in classification problems are suitable for sequential selection problems. In particular, we show that even with a pre-trained model that satisfies the common fairness notions, the selection outcomes may still be biased against certain demographic groups. This observation implies that the fairness notions used in classification problems are not suitable for a selection problem where the applicants compete for a limited number of positions. We introduce a new fairness notion, "Equal Selection (ES)," suitable for sequential selection problems and propose a post-processing approach to satisfy the ES fairness notion. We also consider a setting where the applicants have privacy concerns, and the decision maker only has access to a noisy version of the sensitive attributes. In this setting, we show that perfect ES fairness can still be attained under certain conditions.
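The sequential protocol in the abstract can be sketched as a thresholded greedy loop; the function and its names are illustrative assumptions, not the authors' algorithm or their ES post-processing method.

```python
def sequential_select(scores, n_positions, threshold=0.5):
    """Accept applicants in arrival order while vacancies remain.

    scores: model scores in arrival order (output of a hypothetical
            pre-trained classifier).
    Returns the indices of accepted applicants.
    """
    accepted = []
    for i, score in enumerate(scores):
        if len(accepted) == n_positions:
            break  # all vacant positions are filled
        if score >= threshold:
            accepted.append(i)
    return accepted
```

Even with identical scores, earlier arrivals fill the vacancies and equally qualified later applicants are rejected, which illustrates how competition for limited positions can break classification-style fairness guarantees.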


Logical Activation Functions: Logit-space equivalents of Probabilistic Boolean Operators (Jason d'Eon)

Neural Information Processing Systems

The choice of activation functions and their motivation is a long-standing issue within the neural network community. Neuronal representations within artificial neural networks are commonly understood as logits, representing the log-odds score of the presence of features within the stimulus. We derive logit-space operators equivalent to the probabilistic Boolean logic gates AND, OR, and XNOR for independent probabilities. Such operators are important for formalizing more complex dendritic operations in real neurons, and they can be used as activation functions within a neural network, introducing probabilistic Boolean logic as the core operation of the network. Since these functions involve taking multiple exponents and logarithms, they are computationally expensive and not well suited to direct use within neural networks.
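The logit-space gates described above follow directly from the probability-space definitions for independent inputs; this is a minimal sketch, not the paper's numerically stable activation functions.

```python
import math

def sigmoid(x):
    # Map a logit to a probability.
    return 1.0 / (1.0 + math.exp(-x))

def logit(p):
    # Map a probability back to log-odds.
    return math.log(p / (1.0 - p))

def and_logit(x, y):
    # AND for independent inputs: p_and = p * q.
    return logit(sigmoid(x) * sigmoid(y))

def or_logit(x, y):
    # OR via inclusion-exclusion: p_or = p + q - p*q.
    p, q = sigmoid(x), sigmoid(y)
    return logit(p + q - p * q)

def xnor_logit(x, y):
    # XNOR: both features present, or both absent.
    p, q = sigmoid(x), sigmoid(y)
    return logit(p * q + (1.0 - p) * (1.0 - q))
```

Note the repeated exp/log round-trips: each gate evaluation leaves and re-enters logit space, which is the computational expense the abstract highlights as an obstacle to using these operators directly in networks.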


The Download: meet Cathy Tie, and Anthropic's new AI models

MIT Technology Review

Since the Chinese biophysicist He Jiankui was released from prison in 2022, he has sought to make a scientific comeback and to repair his reputation after a three-year incarceration for illegally creating the world's first gene-edited children. One area of visible success on his comeback trail has been his X.com account. Over the past few years, his account has evolved from sharing mundane images of his daily life to spreading outrageous, antagonistic messages. This has left observers unsure what to take seriously. Last month, in reply to MIT Technology Review's questions about who was responsible for the account's transformation into a font of clever memes, He emailed us back: "It's thanks to Cathy Tie." Tie is no stranger to the public spotlight.