





2 Projection on the (n,k)-simplex. We consider the following projection problem: pα(z) = argmin …

Neural Information Processing Systems

Usually, this is done by projecting the score vector onto a probability simplex, and such projections are often characterized as Lipschitz continuous approximations of the argmax function, whose Lipschitz constant is controlled by a parameter that is similar to a softmax temperature.
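Below is a minimal sketch of one standard way to compute such a projection: Euclidean projection onto the probability simplex via the sorting-based algorithm of Duchi et al. (2008), with a temperature-like parameter `alpha` scaling the scores before projection. The function name and the choice of Euclidean projection are illustrative assumptions, not necessarily the exact operator studied in the paper above.

```python
import numpy as np

def project_simplex(z, alpha=1.0):
    """Euclidean projection of z / alpha onto the probability simplex.

    Smaller alpha pushes the result toward a one-hot argmax vector;
    larger alpha spreads mass toward uniform (a softmax-temperature-like role).
    Illustrative sketch only, not the specific operator of the paper above.
    """
    v = np.asarray(z, dtype=float) / alpha
    n = v.size
    u = np.sort(v)[::-1]                        # sort scores in decreasing order
    css = np.cumsum(u)
    rho = np.nonzero(u + (1.0 - css) / np.arange(1, n + 1) > 0)[0][-1]
    tau = (css[rho] - 1.0) / (rho + 1)          # shift so the result sums to 1
    return np.maximum(v - tau, 0.0)

# Low temperature approaches argmax; high temperature approaches uniform.
scores = np.array([2.0, 1.0, 0.1])
print(project_simplex(scores, alpha=0.1))   # essentially one-hot
print(project_simplex(scores, alpha=10.0))  # close to uniform
```

Varying `alpha` here mirrors the temperature behavior described above: the projection interpolates between a hard argmax and an evenly spread distribution.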


TraceTrans: Translation and Spatial Tracing for Surgical Prediction

Luo, Xiyu, Li, Haodong, Cheng, Xinxing, Zhao, He, Hu, Yang, Song, Xuan, Zhang, Tianyang

arXiv.org Artificial Intelligence

Image-to-image translation models have achieved notable success in converting images across visual domains and are increasingly used for medical tasks such as predicting post-operative outcomes and modeling disease progression. However, most existing methods primarily aim to match the target distribution and often neglect spatial correspondences between the source and translated images. This limitation can lead to structural inconsistencies and hallucinations, undermining the reliability and interpretability of the predictions. These challenges are accentuated in clinical applications by the stringent requirement for anatomical accuracy. In this work, we present TraceTrans, a novel deformable image translation model designed for post-operative prediction that generates images aligned with the target distribution while explicitly revealing spatial correspondences with the pre-operative input. The framework employs an encoder for feature extraction and dual decoders for predicting spatial deformations and synthesizing the translated image. The predicted deformation field imposes spatial constraints on the generated output, ensuring anatomical consistency with the source. Extensive experiments on medical cosmetology and brain MRI datasets demonstrate that TraceTrans delivers accurate and interpretable post-operative predictions, highlighting its potential for reliable clinical deployment.
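A minimal PyTorch sketch of the dual-decoder idea described in the abstract: a shared encoder feeds one head that predicts a dense deformation field and another that synthesizes the translated image, with the field used to warp the source so the prediction stays spatially tied to it. All module names and sizes are illustrative assumptions and do not reproduce the actual TraceTrans architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualDecoderTranslator(nn.Module):
    """Shared encoder with two heads: a deformation-field decoder and an image decoder.

    Illustrative sketch of the encoder / dual-decoder layout described in the
    abstract; layer sizes are arbitrary assumptions, not the published model.
    """
    def __init__(self, in_ch=1, feat=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(),
        )
        # Decoder 1: dense 2-channel displacement field (dx, dy).
        self.flow_head = nn.Conv2d(feat, 2, 3, padding=1)
        # Decoder 2: synthesized post-operative image.
        self.image_head = nn.Conv2d(feat, in_ch, 3, padding=1)

    def forward(self, src):
        b, _, h, w = src.shape
        feats = self.encoder(src)
        flow = self.flow_head(feats)                 # (B, 2, H, W) displacements
        pred = torch.tanh(self.image_head(feats))    # translated image

        # Identity sampling grid plus predicted displacement, then warp the
        # source so it remains anatomically tied to the prediction.
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, h, device=src.device),
            torch.linspace(-1, 1, w, device=src.device),
            indexing="ij",
        )
        identity = torch.stack((xs, ys), dim=-1).expand(b, h, w, 2)
        grid = identity + flow.permute(0, 2, 3, 1)
        warped_src = F.grid_sample(src, grid, align_corners=True)
        return pred, warped_src, flow

# A spatial-consistency term could then penalize ||pred - warped_src||,
# tying the synthesized output to an explicit deformation of the input.
model = DualDecoderTranslator()
pred, warped, flow = model(torch.randn(2, 1, 64, 64))
```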






StabStitch++: Unsupervised Online Video Stitching with Spatiotemporal Bidirectional Warps

Nie, Lang, Lin, Chunyu, Liao, Kang, Zhang, Yun, Liu, Shuaicheng, Zhao, Yao

arXiv.org Artificial Intelligence

We retarget video stitching to an emerging issue, termed warping shake, which refers to the temporal content shake induced by sequentially unsmooth warps when extending image stitching to video stitching. Even if the input videos are stable, the stitched video can still exhibit undesired warping shakes that degrade the visual experience. To address this issue, we propose StabStitch++, a novel video stitching framework that realizes spatial stitching and temporal stabilization simultaneously through unsupervised learning. First, unlike existing learning-based image stitching solutions that typically warp one image to align with another, we posit a virtual midplane between the original image planes and project both views onto it. Concretely, we design a differentiable bidirectional decomposition module to disentangle the homography transformation and incorporate it into our spatial warp, evenly spreading alignment burdens and projective distortions across the two views. Then, inspired by camera paths in video stabilization, we derive a mathematical expression for stitching trajectories in video stitching by integrating spatial and temporal warps. Finally, a warp smoothing model is presented to produce stable stitched videos, using a hybrid loss that simultaneously encourages content alignment, trajectory smoothness, and online collaboration. Compared with StabStitch, which sacrifices alignment for stabilization, StabStitch++ makes no such compromise and optimizes both simultaneously, especially in the online mode. To establish an evaluation benchmark and train the learning framework, we build a video stitching dataset with rich diversity in camera motions and scenes.
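A minimal PyTorch sketch of the kind of hybrid objective sketched above: given a sequence of per-frame warp parameters (the stitching trajectory) and the two warped views, penalize alignment error in the overlap together with second-order temporal smoothness of the trajectory. The weights, the parameterization of the warps, and the function name are illustrative assumptions, not the published StabStitch++ loss.

```python
import torch

def hybrid_warp_loss(warps, warped_a, warped_b, w_align=1.0, w_smooth=0.1):
    """Toy hybrid loss for online warp smoothing.

    warps:              (T, H, W, 2) per-frame warp/flow parameters (the "trajectory").
    warped_a, warped_b: (T, C, H, W) the two views after warping, used for alignment.
    The exact terms of StabStitch++ are more involved; this only illustrates the
    trade-off between content alignment and trajectory smoothness.
    """
    # Content alignment: the two warped views should agree where they overlap.
    align = (warped_a - warped_b).abs().mean()

    # Trajectory smoothness: penalize acceleration of the warp parameters
    # over time (second-order temporal differences).
    accel = warps[2:] - 2 * warps[1:-1] + warps[:-2]
    smooth = accel.pow(2).mean()

    return w_align * align + w_smooth * smooth

# Example with random tensors standing in for real warped frames.
T, C, H, W = 8, 3, 64, 64
warps = torch.randn(T, H, W, 2, requires_grad=True)
loss = hybrid_warp_loss(warps, torch.randn(T, C, H, W), torch.randn(T, C, H, W))
loss.backward()
```

Raising `w_smooth` favors a steadier trajectory (stabilization) at the cost of per-frame alignment, which is exactly the trade-off the abstract says StabStitch++ tries to avoid having to make.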