Collaborating Authors

 cohn


Greenland 'will stay Greenland', former Trump adviser declares

BBC News

Donald Trump will not be able to force Greenland to change ownership, a former top adviser to the US president has told the BBC. IBM's vice chairman Gary Cohn, who advised Trump on the economy in his first term, said "Greenland will stay Greenland" and linked the need for access to critical minerals to his former boss's plans for the territory. Cohn is one of America's top tech bosses, a leader in the race to develop AI and quantum computing, and served under Trump as director of the White House National Economic Council. In a sign of how seriously business leaders are taking the crisis, he warned that invading an independent country that is part of Nato would be over the edge. He also suggested the president's recent comments about Greenland may be part of a negotiation.


ComicScene154: A Scene Dataset for Comic Analysis

arXiv.org Artificial Intelligence

Comics offer a compelling yet under-explored domain for computational narrative analysis, combining text and imagery in ways distinct from purely textual or audiovisual media. We introduce ComicScene154, a manually annotated dataset of scene-level narrative arcs derived from public-domain comic books spanning diverse genres. By conceptualizing comics as an abstraction for narrative-driven, multimodal data, we highlight their potential to inform broader research on multimodal storytelling. To demonstrate the utility of ComicScene154, we present a baseline scene segmentation pipeline, providing an initial benchmark that future studies can build upon. Our results indicate that ComicScene154 constitutes a valuable resource for advancing computational methods in multimodal narrative understanding and expanding the scope of comic analysis within the Natural Language Processing community.


Can Large Language Models Reason about the Region Connection Calculus?

arXiv.org Artificial Intelligence

Qualitative Spatial Reasoning is a well-explored area of Knowledge Representation and Reasoning and has multiple applications ranging from Geographical Information Systems to Robotics and Computer Vision. Recently, many claims have been made for the reasoning capabilities of Large Language Models (LLMs). Here, we investigate the extent to which a set of representative LLMs can perform classical qualitative spatial reasoning tasks on the mereotopological Region Connection Calculus, RCC-8. We conduct three pairs of experiments (reconstruction of composition tables, alignment to human composition preferences, conceptual neighbourhood reconstruction) using state-of-the-art LLMs; in each pair, one experiment uses eponymous relations and the other uses anonymous relations (to test the extent to which the LLM relies on knowledge about the relation names obtained during training). All instances are repeated 30 times to measure the stochasticity of the LLMs.


ChatGPT got an upgrade to make it seem more human

New Scientist

OpenAI announced its newest artificial intelligence model, called GPT-4o, which will soon power some versions of the company's ChatGPT product. The upgraded ChatGPT can swiftly respond to text, audio and video inputs from its real-time conversational partner – all while speaking with inflections and wording that convey a strong sense of emotion and personality. The company demonstrated the emotional mimicry of the new voice mode during a supposedly live OpenAI presentation, featuring both the ChatGPT mobile app and a new desktop app, on 13 May. Speaking in a female-sounding voice and responding to the name ChatGPT, the new AI's conversational capabilities seemed more akin to the personable AI voiced by Scarlett Johansson in the 2013 science fiction film Her than to the more canned and robotic responses of typical voice assistant technologies. "The new GPT-4o voice-to-voice interaction more closely parallels human-human interaction," says Michelle Cohn at the University of California, Davis.


Object-agnostic Affordance Categorization via Unsupervised Learning of Graph Embeddings

arXiv.org Artificial Intelligence

Acquiring knowledge about object interactions and affordances can facilitate scene understanding and human-robot collaboration tasks. As humans tend to use objects in many different ways depending on the scene and the objects' availability, learning object affordances in everyday-life scenarios is a challenging task, particularly in the presence of an open set of interactions and objects. We address the problem of affordance categorization for class-agnostic objects with an open set of interactions; we achieve this by learning similarities between object interactions in an unsupervised way and thus inducing clusters of object affordances. A novel depth-informed qualitative spatial representation is proposed for the construction of Activity Graphs (AGs), which abstract from the continuous representation of spatio-temporal interactions in RGB-D videos. These AGs are clustered to obtain groups of objects with similar affordances. Our experiments in a real-world scenario demonstrate that our method learns to create object affordance clusters with a high V-measure even in cluttered scenes. The proposed approach handles object occlusions by effectively capturing possible interactions, without imposing any object or scene constraints.


Cohn

AAAI Conferences

Successful analysis of video data requires an integration of techniques from KR, Computer Vision, and Machine Learning. Detecting and tracking objects, as well as extracting their changing spatial relations with other objects, is one approach to describing and detecting events. Different kinds of spatial relations are important, including topology, direction, size, and distance between objects as well as changes of those relations over time. Typically these kinds of relations are treated separately, which makes it difficult to integrate all the extracted spatial information. We present a uniform and comprehensive spatial representation of moving objects that includes all the above spatial/temporal aspects, analyse different properties of this representation and demonstrate that it is suitable for video analysis.


A human-like planner that allows robots to reach for objects in cluttered environments

#artificialintelligence

While research in the field of robotics has led to significant advances over the past few years, there are still substantial differences in how humans and robots handle objects. In fact, even the most sophisticated robots developed so far struggle to match the object manipulation skills of the average toddler. One particular aspect of object manipulation that most robots have not yet mastered is reaching for and grasping specific objects in a cluttered environment. To overcome this limitation, as part of an EPSRC-funded project, researchers at the University of Leeds have recently developed a human-like robotic planner that combines virtual reality (VR) and machine learning (ML) techniques. This new planner, introduced in a paper pre-published on arXiv and set to be presented at the International Conference on Robotics and Automation (ICRA), could enhance the performance of a variety of robots in object manipulation tasks.


CES gadget show: Surveillance is in -- and in a big way

The Japan Times

NEW YORK – From the face scanner that will check in some attendees to the cameras-everywhere array of digital products, the CES gadget show is all-in on surveillance technology -- whether it calls it that or not. Nestled in the "smart home" and "smart city" showrooms at the sprawling Las Vegas consumer tech conference are devices that see, hear and track the people they encounter. Some of them also analyze their looks and behavior. The technology on display includes eyelid-tracking car dashboard cameras to prevent distracted driving and "rapid DNA" kits for identifying a person from a cheek swab sample. All these talking speakers, doorbell cameras and fitness trackers come with the promise of making life easier or more fun, but they're also potentially powerful spying tools.


The Artificial Intelligence Task Force Wants to Do AI the Vermont Way

#artificialintelligence

Artificial Intelligence was once the stuff of science fiction. Now it's here, and every publication from the Washington Post to Wired to the Wall Street Journal is full of articles and videos exploring it. Depending on whom you listen to, AI will be a job killer or a job creator; a tool to boost productivity or Skynet from the Terminator movies; a technology that will dramatically transform society or an overhyped nothingburger. To help prepare for this uncertain and potentially disturbing future, Gov. Phil Scott impaneled an Artificial Intelligence Task Force in 2018. Its mandate: to "investigate the field of artificial intelligence" in the state and make recommendations for how the technology can be responsibly applied in Vermont's economy and government.


Sam Nunberg's Media Tour Tops This Week's Internet News Roundup

WIRED

Clocks move forward this weekend, which can only mean it's time for the East Coast to struggle under feet of snow once again. Well, that or it's time for Barack and Michelle Obama to team up with Netflix to produce shows to guide humanity into the future. While the world keeps turning, however, let's answer this one very important question: What was the rest of the internet up to last week? What Happened: In a move that surely delighted everyone who'd ever wanted to ignore all legal advice and do something stupid, one witness in the ongoing investigation into potential Russian interference in the 2016 election decided to do a media tour after receiving a subpoena for evidence. What Really Happened: Before last week, it's probably fair to say that most people hadn't heard of Sam Nunberg.