Google Lens
Evaluating Precise Geolocation Inference Capabilities of Vision Language Models

Neel Jay, Hieu Minh Nguyen, Trung Dung Hoang, Jacob Haimes

arXiv.org Artificial Intelligence

The prevalence of Vision-Language Models (VLMs) raises important questions about privacy in an era where visual information is increasingly available. While foundation VLMs demonstrate broad knowledge and learned capabilities, we specifically investigate their ability to infer geographic location from previously unseen image data. This paper introduces a benchmark dataset collected from Google Street View that reflects Street View's global distribution of coverage. Foundation models are evaluated on single-image geolocation inference, with many achieving median distance errors of under 300 km. We further evaluate VLM "agents" with access to supplemental tools, observing up to a 30.6% decrease in distance error. Our findings establish that modern foundation VLMs can act as powerful image geolocation tools without being specifically trained for this task. Coupled with the increasing accessibility of these models, our findings have broad implications for online privacy. We discuss these risks, as well as future work in this area.
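The "distance error" metric described above is conventionally the great-circle (haversine) distance between the predicted and ground-truth coordinates, with the median taken across the benchmark. A minimal sketch of that scoring, assuming a spherical Earth; the function names are illustrative, not from the paper:

```python
import math
from statistics import median

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def median_distance_error_km(predictions, ground_truth):
    """Median great-circle error over paired (lat, lon) tuples."""
    errors = [haversine_km(p[0], p[1], t[0], t[1])
              for p, t in zip(predictions, ground_truth)]
    return median(errors)
```

Because the median is robust to outliers, a model can place a handful of images on the wrong continent and still score well if most guesses land near the true location.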


How to use Visual Intelligence, Apple's take on Google Lens

Engadget

The recent rollout of iOS 18.2 finally brings many of the promised Apple Intelligence features, like Genmoji and Image Playground. One such long-awaited tool is Visual Intelligence, a feature currently reserved for the iPhone 16 Pro and Pro Max that was first introduced at the company's September event. Visual Intelligence is Apple's answer to Google Lens. It leverages the camera system and AI to analyze images in real time and provide useful information. This can help people learn more about the world around them and is particularly handy for shopping, looking up details about a restaurant or business, translating written text, summarizing text or having something read aloud.


Google stuffs more AI into search

Engadget

Google is adding more AI to search. On Thursday, the company unveiled a long list of changes, including AI-organized web results, Google Lens updates (including video and voice) and placing links and ads inside AI Overviews. One can suspect that AI-organized search results are where Google will eventually move across the board, but the rollout starts with a narrow scope. Beginning with recipes and meal inspiration, Google's AI will create a "full-page experience" that includes relevant results based on your search. The company says the AI-collated pages will consist of "perspectives from across the web," like articles, videos and forums.


Google's Visual Search Can Now Answer Even More Complex Questions

WIRED

When Google Lens was introduced in 2017, the search feature accomplished a feat that not too long ago would have seemed like the stuff of science fiction: Point your phone's camera at an object and Google Lens can identify it, show some context, maybe even let you buy it. It was a new way of searching, one that didn't involve awkwardly typing out descriptions of things you were seeing in front of you. Lens also demonstrated how Google planned to use its machine learning and AI tools to ensure its search engine shows up on every possible surface. As Google increasingly uses its foundational generative AI models to generate summaries of information in response to text searches, Google Lens' visual search has been evolving, too. And now the company says Lens, which powers around 20 billion searches per month, is going to support even more ways to search, including video and multimodal searches.


Google will let you search your Chrome browsing history by asking questions like a human

Engadget

You're neck deep in a research project but the finish line is in sight. You hit the close button on your browser. It vanishes and takes the dozens of tabs you had open with it. You heave a sigh of relief -- and then remember that you need to verify just one more detail from one of the web pages you had open. The problem is that you have no idea which one it was or how to get back there.


Fitbit's health chatbot will arrive later this year

Engadget

Like most other corners of the tech world, Google sees AI powering the next innovations in health technology. The company's annual The Check Up event expanded on its plans to add a personal health chatbot to the Fitbit app, expand Google Lens for better skin condition searches and use a version of its Gemini chatbot in the medical field. One of the more intriguing of Google's announcements on Tuesday was more detail about an experimental AI feature for Fitbit users, briefly teased last year. Fitbit Labs will let owners draw correlations and "connect the dots" from health data tracked using their wearable devices. A chatbot in the mobile app will let you ask questions in natural language and create personalized charts to learn about your health.


Google can now translate text from images on the web

Engadget

Google Translate on the web can now convert text from images. It uses the same tech as the AR Translate tool for Google Lens, which performs real-time translations on smartphones. You'll find the option on the Google Translate website, where you'll see a new Images tab at the top. After uploading a photo or screenshot from your computer, a translation appears that (in most cases) should look about as seamless as the original text. The web interface includes options to copy the text, download the translated image or clear it.


The Search Engine Showdown is Far from Over

#artificialintelligence

Back in the 1990s, the search engine category was a hot space. Yahoo, Netscape, AOL, Ask Jeeves, AltaVista, Google Search, MSN and others were vying for the dominant position. With time, all but one fizzled out: post-2000 was the era of Google Search, the undisputed winner of the space until quite recently. Now the tide is turning, and Google Search's crown is under threat.


Machine translation for medical chat, checkpoint #2

#artificialintelligence

I've made some progress since my previous post on machine translation for medical chat, and this is a second checkpoint. I visited a friend in Tokyo for a week, and my Japanese proficiency is extremely limited, coming mainly from Duolingo and the little I remember from anime. While I was there, I kept up with my Duolingo practice and relied heavily on Google Lens, which translates text in images. Google Lens was great, and it was fast even with offline models. It was particularly good for translating signs, such as those in parks or tourist areas.


Lens AI Is Now Used Everywhere For Google Image Search

#artificialintelligence

Google Lens has been around for some time now as the search giant's de facto AI search for images and image-based text. Now, following rumors that Lens might be coming to desktop platforms, searching Google via an image upload uses the Assistant-related feature as well. That's based on recent reports following a rollout on the company's search page. For clarity, that means searches at images.google.com, the site that is effectively Google's solution for reverse image search.