ReMaX: Relaxing for Better Training on Efficient Panoptic Segmentation

Neural Information Processing Systems

This paper presents a new mechanism to facilitate the training of mask transformers for efficient panoptic segmentation, democratizing its deployment. We observe that the high complexity of the panoptic segmentation training objective inevitably leads to much heavier penalization of false positives.

Preference Learning Algorithms Do Not Learn Preference Rankings

Neural Information Processing Systems

Preference learning algorithms (e.g., RLHF and DPO) are frequently used to steer LLMs to produce generations that are more preferred by humans, but our understanding of their inner workings is still limited.