 burglar


Santa Monica uses police drone to catch car burglar in the act

Los Angeles Times

Santa Monica Police spotted and stopped a man who was burglarizing vehicles in a parking lot near the pier by using a drone. On July 6, a Santa Monica police officer was directing the department's drone back to the station from a radio call when the officer decided to survey the Fourth of July weekend crowd near the pier and the nearby parking lots. As the drone flew over Lot 1 North, the parking lot next to the pier, he noticed a man wandering the lot, according to a video the department posted on their YouTube account. "As [the pilot] watched, the subject approached an unoccupied parked vehicle, pulled out tools from his sweatshirt and quickly punched open the lock of the driver's side door," the department said in the video. The drone footage shows the suspected burglar break the lock of the driver's side door of a black SUV then climb into the car.


MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures

Ni, Jinjie, Xue, Fuzhao, Yue, Xiang, Deng, Yuntian, Shah, Mahir, Jain, Kabir, Neubig, Graham, You, Yang

arXiv.org Artificial Intelligence

Evaluating large language models (LLMs) is challenging. Traditional ground-truth-based benchmarks fail to capture the comprehensiveness and nuance of real-world queries, while LLM-as-judge benchmarks suffer from grading biases and limited query quantity. Both of them may also become contaminated over time. User-facing evaluation, such as Chatbot Arena, provides reliable signals but is costly and slow. In this work, we propose MixEval, a new paradigm for establishing efficient, gold-standard LLM evaluation by strategically mixing off-the-shelf benchmarks. It bridges (1) comprehensive and well-distributed real-world user queries and (2) efficient and fairly-graded ground-truth-based benchmarks, by matching queries mined from the web with similar queries from existing benchmarks. Based on MixEval, we further build MixEval-Hard, which offers more room for model improvement. Our benchmarks' advantages lie in (1) a 0.96 model ranking correlation with Chatbot Arena arising from the highly impartial query distribution and grading mechanism, (2) fast, cheap, and reproducible execution (6% of the time and cost of MMLU), and (3) dynamic evaluation enabled by the rapid and stable data update pipeline. We provide extensive meta-evaluation and analysis for our and existing LLM benchmarks to deepen the community's understanding of LLM evaluation and guide future research directions.


Silico-centric Theory of Mind

Mukherjee, Anirban, Chang, Hannah Hanwen

arXiv.org Artificial Intelligence

Theory of Mind (ToM) refers to the ability to attribute mental states, such as beliefs, desires, intentions, and knowledge, to oneself and others, and to understand that these mental states can differ from one's own and from reality. We investigate ToM in environments with multiple, distinct, independent AI agents, each possessing unique internal states, information, and objectives. Inspired by human false-belief experiments, we present an AI ('focal AI') with a scenario where its clone undergoes a human-centric ToM assessment. We prompt the focal AI to assess whether its clone would benefit from additional instructions. Concurrently, we give its clones the ToM assessment, both with and without the instructions, thereby engaging the focal AI in higher-order counterfactual reasoning akin to human mentalizing--with respect to humans in one test and to other AI in another. We uncover a discrepancy: Contemporary AI demonstrates near-perfect accuracy on human-centric ToM assessments. Since information embedded in one AI is identically embedded in its clone, additional instructions are redundant. Yet, we observe AI crafting elaborate instructions for their clones, erroneously anticipating a need for assistance. An independent referee AI agrees with these unsupported expectations. Neither the focal AI nor the referee demonstrates ToM in our 'silico-centric' test.


Group-based Robustness: A General Framework for Customized Robustness in the Real World

Lin, Weiran, Lucas, Keane, Eyal, Neo, Bauer, Lujo, Reiter, Michael K., Sharif, Mahmood

arXiv.org Artificial Intelligence

Machine-learning models are known to be vulnerable to evasion attacks that perturb model inputs to induce misclassifications. In this work, we identify real-world scenarios where the true threat cannot be assessed accurately by existing attacks. Specifically, we find that conventional metrics measuring targeted and untargeted robustness do not appropriately reflect a model's ability to withstand attacks from one set of source classes to another set of target classes. To address the shortcomings of existing methods, we formally define a new metric, termed group-based robustness, that complements existing metrics and is better suited for evaluating model performance in certain attack scenarios. We show empirically that group-based robustness allows us to distinguish between models' vulnerability against specific threat models in situations where traditional robustness metrics do not apply. Moreover, to measure group-based robustness efficiently and accurately, we 1) propose two loss functions and 2) identify three new attack strategies. We show empirically that with comparable success rates, finding evasive samples using our new loss functions saves computation by a factor as large as the number of targeted classes, and finding evasive samples using our new attack strategies saves time by up to 99% compared to brute-force search methods. Finally, we propose a defense method that increases group-based robustness by up to 3.52×.


An AI that Can Resurrect Memories!

#artificialintelligence

Martina's few words: Last time I wondered: how could we be sure someone is the person she claims to be? We have talked about digital identity services, and I hypothesized that I would surely use some GAN applications if I ever wanted to fool them. Anyway, when Louis suggested talking about GANs in image restoration this week, I was still thinking about the identity problem. I mean, every digital image is an attempt to simulate reality. The human sight itself is subject to bias and errors.


Kadlec

AAAI Conferences

Recent development in game AI has seen action planning and its derivatives being adapted for controlling agents in classical types of games, such as FPSs or RPGs. Complementarily, one can seek new types of gameplay elements inspired by planning. We propose and formally define a new game "genre" called anticipation games and demonstrate that planning can be used as their key concept both at design time and run time. In an anticipation game, a human player observes a computer-controlled agent or agents, tries to predict their actions and indirectly helps them to achieve their goal. The paper describes an example prototype of an anticipation game we developed. The player helps a burglar steal an artifact from a museum guarded by guard agents. The burglar has incomplete knowledge of the environment and his plan will contain pitfalls. The player has to identify these pitfalls by observing the burglar's behavior and change the environment so that the burglar replans and avoids the pitfalls. The game prototype is evaluated in a small-scale human-subject study, which suggests that the anticipation game concept is promising.


Smart security: how to keep your home safe while you're away

The Guardian

With the prospect of trips out and holidays finally on the cards, over the next few weeks many of us will be leaving our houses unattended for the first time in months. So now is the time to think about making your home a bit more secure. In addition to the basics, there is a range of DIY tech that may help to deter would-be intruders and allow you to keep an eye on your home from almost anywhere in the world. From the simple video doorbell to a fully automated light show, here is a quick guide to what is available, plus some tips from police and insurers to help you keep burglars at bay. Simple to install, easy to use and immediately obvious to potential intruders, video doorbells are a popular home security option.


Roomba Mistaken For a Burglar by Spooked Homeowners

#artificialintelligence

"I was so embarrassed," said the homeowner who called the police on his Roomba. "They asked how long we had it, I said two days and they all started laughing." A couple watching a movie in their bedroom in North Carolina heard a sound downstairs and thought someone had broken into their home. Husband and father Thomas Milam called the police. The operator asked if he was armed and Milam said he was.


Homeowner calls 911 on suspected burglar that turned out to be vacuum

FOX News

At least their house was clean. A North Carolina homeowner called police claiming an intruder was inside his house in the middle of the night, a published report said. The red-faced homeowners were a bit hasty in their assessment, however, when the suspected burglar turned out to be a rogue robot vacuum cleaner, USA Today reported.


Oregon Man Called Police About A Burglar. Armed Officers Found A Rogue Roomba

NPR Technology

Three seasoned deputies -- one with at least 20 years on the force -- a detective who happened to be in the area, and two canine officers all responded to the call of a burglary in progress at a Cedar Hills home near Portland, Ore. Instead, they found a trapped robotic vacuum cleaner. The Washington County sheriff in Oregon says there was nothing unusual about the call. Sure, it was broad daylight -- 1:48 p.m. local time exactly -- but "crime can happen anytime."