To_The_Point__Correspondence_driven_self_supervised_3D_reconstruction.pdf

Apr-25-2026, 14:36:12 GMT–Neural Information Processing Systems

Every image is encoded using an ImageNet pre-trained ResNet18 into a latent feature map $z \in \mathbb{R}^{4 \times 4 \times 256}$. A flattened version of $z$ is processed by a single linear layer with $N \times 3$ output channels to obtain the predictions for the points $u$ and the visibility $v$. We apply the sigmoid function to the visibility predictions $v$ to constrain them to the range [0, 1]. Our models are trained using the Adam optimizer with a learning rate of 1e-4. In detail, scale is sampled from the range [0.7, 1.2], vertical translation is up to 38 pixels, and we also apply 2D rotation of up to 40 degrees. For camera equivariance, the image is simply flipped horizontally and given as input to the network to estimate the pose.
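The shape bookkeeping of the prediction head and the sampling ranges above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The per-point output layout (x, y, visibility logit), the uniform and symmetric sampling of translation and rotation, the random weights, and all function names (`make_linear`, `predict_points`, `sample_transform`) are assumptions made for illustration.

```python
import math
import random

FEATURE_DIM = 4 * 4 * 256  # flattened 4x4x256 latent feature map -> 4096 values


def sigmoid(x):
    """Logistic function, used to squash visibility logits into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def make_linear(in_dim, out_dim, rng):
    """Hypothetical stand-in for a trained linear layer (small random weights)."""
    weights = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def predict_points(z_flat, weights, bias, num_points):
    """Map flattened features to N 2D points u and N visibilities v."""
    out = [sum(w * x for w, x in zip(row, z_flat)) + b
           for row, b in zip(weights, bias)]
    assert len(out) == num_points * 3
    # Assumed layout: (x, y, visibility logit) for each of the N points.
    u = [(out[3 * i], out[3 * i + 1]) for i in range(num_points)]
    v = [sigmoid(out[3 * i + 2]) for i in range(num_points)]
    return u, v


def sample_transform(rng):
    """Draw one set of transformation parameters from the stated ranges."""
    scale = rng.uniform(0.7, 1.2)       # scale range given in the text
    shift_y = rng.uniform(-38.0, 38.0)  # assumed symmetric, in pixels
    angle = rng.uniform(-40.0, 40.0)    # assumed symmetric, in degrees
    return scale, shift_y, angle


rng = random.Random(0)
N = 8  # illustrative number of points; the text does not fix N
weights, bias = make_linear(FEATURE_DIM, N * 3, rng)
z_flat = [rng.random() for _ in range(FEATURE_DIM)]  # placeholder for encoder output
u, v = predict_points(z_flat, weights, bias, N)
scale, shift_y, angle = sample_transform(rng)
```

With N = 8, the linear layer maps 4096 inputs to 24 outputs, which split into 8 two-dimensional points and 8 visibility scores.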

artificial intelligence, deformation, machine learning, (17 more...)

Technology:
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
