Dover
Hyperbolic Image-Text Representations
Karan Desai, Maximilian Nickel, Tanmay Rajpurohit, Justin Johnson, Ramakrishna Vedantam
Visual and linguistic concepts naturally organize themselves in a hierarchy, where a textual concept "dog" entails all images that contain dogs. Although this is intuitive, current large-scale vision and language models such as CLIP do not explicitly capture such a hierarchy. We propose MERU, a contrastive model that yields hyperbolic representations of images and text. Hyperbolic spaces have geometric properties well suited to embedding tree-like data, so MERU can better capture the underlying hierarchy in image-text datasets. Our results show that MERU learns a highly interpretable and structured representation space while being competitive with CLIP's performance on standard multi-modal tasks like image classification and image-text retrieval.
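The abstract does not spell out the geometry, so the following is only an illustrative sketch of what measuring distance in a hyperbolic space can look like. It assumes the Lorentz (hyperboloid) model with curvature -c; the function names (`exp_map_origin`, `time_component`, `lorentz_distance`) are hypothetical and are not taken from the MERU code.

```python
import math

def exp_map_origin(v, c=1.0):
    """Map a Euclidean vector v (a tangent vector at the hyperboloid's origin)
    to the space components of a point on the hyperboloid of curvature -c."""
    norm = math.sqrt(sum(a * a for a in v))
    if norm == 0.0:
        return [0.0] * len(v)
    scale = math.sinh(math.sqrt(c) * norm) / (math.sqrt(c) * norm)
    return [scale * a for a in v]

def time_component(x_space, c=1.0):
    """Time coordinate implied by the constraint <x, x>_L = -1/c."""
    return math.sqrt(1.0 / c + sum(a * a for a in x_space))

def lorentz_distance(x_space, y_space, c=1.0):
    """Geodesic distance between two hyperboloid points given by their space components."""
    # Lorentzian inner product: space dot product minus product of time components.
    inner = sum(a * b for a, b in zip(x_space, y_space))
    inner -= time_component(x_space, c) * time_component(y_space, c)
    # Clamp to the domain of acosh to guard against floating-point rounding.
    arg = max(1.0, -c * inner)
    return math.acosh(arg) / math.sqrt(c)

# Example: lift two Euclidean vectors and compare them (assumed toy inputs).
origin = exp_map_origin([0.0, 0.0])
point = exp_map_origin([3.0, 4.0])
dist_from_origin = lorentz_distance(origin, point)
```

In this model the distance from the origin to a lifted vector equals the vector's Euclidean norm, while distances between points far from the origin grow much faster than in Euclidean space, which is the property that leaves room for tree-like branching.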
Launching Apple, Gmail, And A Harvard-IBM Robot Super-Brain
This week's milestones in the history of technology include the birth of Apple Computer, the first release of Gmail, and IBM signing an agreement with Harvard to build one of the earliest computers, the Automatic Sequence Controlled Calculator (ASCC), later called Mark I. Guglielmo Marconi receives the first wireless signal transmitted across the English Channel, sent from Wimereux, France, to his ship-to-shore station at the South Foreland Lighthouse outside Dover, England. The signal was a test held at the request of the French Government, which was considering licensing the invention in France. Bell Telephone Laboratories announces the invention of the phototransistor, a transistor operated by light rather than electric current, invented by John Northrup Shive. An entirely new type of "electric eye", much smaller and sturdier than present photo-electric cells and possibly cheaper, has been invented at the Laboratories. During the past quarter century, electric eyes have found widespread use in electronics because of their ability to control electric currents by the action of light.