
Collaborating Authors: Belongie


Eyeball reflections can reveal a 3D model of what you are looking at

New Scientist

Your eyes can reveal more than you might think: researchers can now use computer vision technology to reconstruct 3D images of a scene from the reflections on a person's eyeballs. Jia-Bin Huang and his colleagues at the University of Maryland, College Park, developed a computer vision model that takes as input between five and 15 digital photographs of an individual's face, captured from different angles while they look at a scene, and reconstructs that scene from the reflections in their eyes. The method adapts a technique called neural radiance fields (NeRF), which uses neural networks to determine the density and colour of the objects the computer "sees". NeRF usually operates on direct views of a scene, rather than on one reflected in a person's eyeballs. Huang's version builds the scene by extrapolating from a square of, on average, 20 by 20 pixels in each eye.
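
For readers unfamiliar with NeRF, here is a minimal sketch of the core mechanism the article alludes to: a small neural network maps a 3D point and viewing direction to a density and a colour, and samples along each camera ray are alpha-composited into a pixel. This is a generic, toy NeRF in PyTorch, not the Maryland team's eye-reflection pipeline (which must also account for the curved, reflective geometry of the eye); all names and sizes are illustrative.

```python
# Toy NeRF: a network maps (position, view direction) -> (density, colour),
# and colours sampled along a ray are alpha-composited into a pixel.
import torch
import torch.nn as nn

class TinyNeRF(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        # Real NeRF first encodes inputs with sinusoidal positional
        # encodings; omitted here for brevity.
        self.mlp = nn.Sequential(
            nn.Linear(6, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # 1 density + 3 RGB channels
        )

    def forward(self, xyz, view_dir):
        out = self.mlp(torch.cat([xyz, view_dir], dim=-1))
        sigma = torch.relu(out[..., :1])    # non-negative density
        rgb = torch.sigmoid(out[..., 1:])   # colour in [0, 1]
        return sigma, rgb

def render_ray(model, origin, direction, n_samples=64, near=0.1, far=4.0):
    """Alpha-composite colours sampled along one camera ray."""
    t = torch.linspace(near, far, n_samples)
    pts = origin + t[:, None] * direction          # (n_samples, 3)
    dirs = direction.expand(n_samples, 3)
    sigma, rgb = model(pts, dirs)
    delta = t[1] - t[0]                            # uniform sample spacing
    alpha = 1.0 - torch.exp(-sigma.squeeze(-1) * delta)
    # Transmittance: chance the ray reaches each sample unoccluded.
    trans = torch.cumprod(
        torch.cat([torch.ones(1), 1.0 - alpha + 1e-10])[:-1], dim=0)
    weights = trans * alpha
    return (weights[:, None] * rgb).sum(dim=0)     # final pixel colour

pixel = render_ray(TinyNeRF(), torch.zeros(3), torch.tensor([0.0, 0.0, 1.0]))
```

Training compares rendered pixels against the input photographs and backpropagates through the renderer; in the eye-reflection setting, the rays would bounce off the eye's surface before sampling the scene.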


Serge Belongie Appointed Andrew H. and Ann R. Tisch Chaired Professor at Cornell Tech

Cornell Computer Science

Serge Belongie, a member of the Computer Science department and Associate Dean at Cornell Tech, has been named the Andrew H. and Ann R. Tisch Professor. Of his investiture as an endowed chair, effective April 1, Belongie says, "I wish I had the words to express my gratitude for this remarkable honor." In his capacity as Associate Dean at Cornell Tech, Belongie is "busy with coronavirus pandemic-related planning for Fall semester course offerings." As a professor, he is working on "growing our cross-campus research efforts in Mixed Reality." The latter initiative "gathers efforts from across Cornell's campuses that relate to augmented and virtual reality, and their core disciplines of computer vision, computer graphics, and human-computer interaction."


Context Embedding Networks

Kim, Kun Ho, Mac Aodha, Oisin, Perona, Pietro

arXiv.org Machine Learning

Low-dimensional embeddings that capture the main variations of interest in collections of data are important for many applications. One way to construct these embeddings is to acquire estimates of similarity from the crowd. However, similarity is a multi-dimensional concept that varies from individual to individual. Existing models for learning embeddings from the crowd typically make simplifying assumptions, such as that all individuals estimate similarity using the same criteria, that the list of criteria is known in advance, or that crowd workers are not influenced by the data they see. To overcome these limitations, we introduce Context Embedding Networks (CENs). In addition to learning interpretable embeddings from images, CENs also model worker biases for different attributes, along with the visual context, i.e. the visual attributes highlighted by a set of images. Experiments on two noisy, crowd-annotated datasets show that modeling both worker bias and visual context results in more interpretable embeddings compared with existing approaches.
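
To make the modelling idea concrete, here is a toy sketch (not the authors' implementation): each image receives a low-dimensional embedding, and each worker receives a learned weight vector over the embedding dimensions, so that different workers can emphasise different attributes when judging similarity. A triplet (a, p, n) records that a worker judged image a closer to p than to n. CENs additionally condition on the visual context of the displayed image set, which this sketch omits; all names and sizes are hypothetical.

```python
# Toy crowd-embedding model with per-worker attribute weights.
import torch
import torch.nn as nn

class ToyCrowdEmbedding(nn.Module):
    def __init__(self, n_images, n_workers, dim=4):
        super().__init__()
        self.emb = nn.Embedding(n_images, dim)              # image embeddings
        self.worker_logits = nn.Embedding(n_workers, dim)   # per-worker bias

    def triplet_logit(self, w, a, p, n):
        # Softmax keeps each worker's attribute weights positive, summing to 1.
        weights = torch.softmax(self.worker_logits(w), dim=-1)
        ea, ep, en = self.emb(a), self.emb(p), self.emb(n)
        d_ap = (weights * (ea - ep) ** 2).sum(-1)   # worker-weighted distances
        d_an = (weights * (ea - en) ** 2).sum(-1)
        return d_an - d_ap   # > 0: model agrees a is closer to p than to n

# One training step on a random batch of recorded judgements
# (worker id, anchor, positive, negative), with a logistic loss.
model = ToyCrowdEmbedding(n_images=100, n_workers=10)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
w = torch.randint(0, 10, (32,))
a, p, n = (torch.randint(0, 100, (32,)) for _ in range(3))
loss = nn.functional.softplus(-model.triplet_logit(w, a, p, n)).mean()
opt.zero_grad(); loss.backward(); opt.step()
```

The logistic loss softplus(-logit) maximises the probability that the model reproduces each recorded judgement, and inspecting the learned per-worker weights afterwards shows which attributes each annotator appeared to attend to.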