
More Questions than Answers? Lessons from Integrating Explainable AI into a Cyber-AI Tool

Suh, Ashley, Li, Harry, Kenney, Caitlin, Alperin, Kenneth, Gomez, Steven R.

arXiv.org Artificial Intelligence

We share observations and challenges from an ongoing effort to implement Explainable AI (XAI) in a domain-specific workflow for cybersecurity analysts. Specifically, we briefly describe a preliminary case study on the use of XAI for source code classification, where accurate assessment and timeliness are paramount. We find that the outputs of state-of-the-art saliency explanation techniques (e.g., SHAP or LIME) are lost in translation when interpreted by people with little AI expertise, despite these techniques being marketed for non-technical users. Moreover, we find that popular XAI techniques offer fewer insights for real-time human-AI workflows when they are post hoc and too localized in their explanations. Instead, we observe that cyber analysts need higher-level, easy-to-digest explanations that can offer as little disruption as possible to their workflows. We outline unaddressed gaps in practical and effective XAI, then touch on how emerging technologies like Large Language Models (LLMs) could mitigate these existing obstacles.
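The "lost in translation" problem above is easy to see in a minimal perturbation-based saliency sketch, in the spirit of LIME or SHAP: mask each token and record the change in the model's score. The toy "suspicious code" classifier and keyword list here are illustrative assumptions, not the case study's actual model; the point is that the raw output, a list of signed per-token weights, is exactly the kind of localized artifact a non-AI-expert analyst must interpret.

```python
# Perturbation-based saliency sketch (LIME/SHAP-style, illustrative only).
SUSPICIOUS = {"exec", "eval", "socket", "base64"}

def score(tokens):
    """Toy malicious-code classifier: fraction of suspicious tokens."""
    if not tokens:
        return 0.0
    return sum(t in SUSPICIOUS for t in tokens) / len(tokens)

def saliency(tokens):
    """Per-token importance: the score drop when that token is removed."""
    base = score(tokens)
    return [(t, base - score(tokens[:i] + tokens[i + 1:]))
            for i, t in enumerate(tokens)]

code = ["import", "base64", "data", "=", "eval", "payload"]
for token, weight in saliency(code):
    print(f"{token:>8}: {weight:+.3f}")
```

Even on this toy example, the analyst is handed small positive and negative fractions per token rather than the higher-level, digestible summary the abstract argues for.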


Institutional Platform for Secure Self-Service Large Language Model Exploration

Bumgardner, V. K. Cody, Klusty, Mitchell A., Logan, W. Vaiden, Armstrong, Samuel E., Hickey, Caylin, Talbert, Jeff

arXiv.org Artificial Intelligence

This paper introduces a user-friendly platform developed by the University of Kentucky Center for Applied AI, designed to make large, customized language models (LLMs) more accessible. By capitalizing on recent advancements in multi-LoRA inference, the system efficiently accommodates custom adapters for a diverse range of users and projects. The paper outlines the system's architecture and key features, encompassing dataset curation, model training, secure inference, and text-based feature extraction. We illustrate the establishment of a tenant-aware computational network using agent-based methods, securely utilizing islands of isolated resources as a unified system. The platform strives to deliver secure LLM services, emphasizing process and data isolation, end-to-end encryption, and role-based resource authentication. This contribution aligns with the overarching goal of enabling simplified access to cutting-edge AI models and technology in support of scientific discovery.
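The core serving idea, one shared base model with per-tenant LoRA adapters resolved at request time, can be sketched as a tenant-aware routing table. All names and structures below are assumptions for illustration, not the platform's actual API.

```python
# Illustrative sketch of tenant-aware multi-LoRA adapter routing.
class MultiLoRARouter:
    def __init__(self, base_model_id):
        self.base_model_id = base_model_id
        self.adapters = {}  # (tenant, project) -> adapter id

    def register(self, tenant, project, adapter_id):
        """Attach a custom LoRA adapter for one tenant's project."""
        self.adapters[(tenant, project)] = adapter_id

    def route(self, tenant, project):
        """Resolve which adapter (if any) serves this request;
        unknown tenants fall back to the unadapted base model."""
        adapter = self.adapters.get((tenant, project))
        return {"model": self.base_model_id, "adapter": adapter}

router = MultiLoRARouter("base-llm-7b")
router.register("lab-a", "notes", "lora-lab-a-notes-v1")
print(router.route("lab-a", "notes"))
```

Keeping the adapter lookup keyed by tenant is what lets many isolated projects share one set of base-model weights efficiently.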


Debate Helps Supervise Unreliable Experts

Michael, Julian, Mahdi, Salsabila, Rein, David, Petty, Jackson, Dirani, Julien, Padmakumar, Vishakh, Bowman, Samuel R.

arXiv.org Artificial Intelligence

As AI systems are used to answer more difficult questions and potentially help create new knowledge, judging the truthfulness of their outputs becomes more difficult and more important. How can we supervise unreliable experts, which have access to the truth but may not accurately report it, to give answers that are systematically true and don't just superficially seem true, when the supervisor can't tell the difference between the two on their own? In this work, we show that debate between two unreliable experts can help a non-expert judge more reliably identify the truth. We collect a dataset of human-written debates on hard reading comprehension questions where the judge has not read the source passage, only ever seeing expert arguments and short quotes selectively revealed by 'expert' debaters who have access to the passage. In our debates, one expert argues for the correct answer, and the other for an incorrect answer. Comparing debate to a baseline we call consultancy, where a single expert argues for only one answer which is correct half of the time, we find that debate performs significantly better, with 84% judge accuracy compared to consultancy's 74%. Debates are also more efficient, being 68% of the length of consultancies. By comparing human to AI debaters, we find evidence that with more skilled (in this case, human) debaters, the performance of debate goes up but the performance of consultancy goes down. Our error analysis also supports this trend, with 46% of errors in human debate attributable to mistakes by the honest debater (which should go away with increased skill); whereas 52% of errors in human consultancy are due to debaters obfuscating the relevant evidence from the judge (which should become worse with increased skill). Overall, these results show that debate is a promising approach for supervising increasingly capable but potentially unreliable AI systems.


Estimating See and Be Seen Performance with an Airborne Visual Acquisition Model

Underhill, Ngaire, Maki, Evan, Gill, Bilal, Weinert, Andrew

arXiv.org Artificial Intelligence

Separation provision and collision avoidance are fundamental components of the layered conflict management system that ensures safe and efficient operations. Pilots have visual separation responsibilities, to see and be seen, to maintain separation between aircraft. To safely integrate into the airspace, drones should be required to meet a minimum level of performance, baselined against the safety achieved by see-and-be-seen interactions between crewed aircraft. Drone interactions with crewed aircraft should be no more hazardous than interactions between traditional aviation aircraft. Accordingly, there is a need for a methodology to design and evaluate detect and avoid systems, to be equipped by drones to mitigate the risk of a midair collision, where the methodology explicitly addresses, both semantically and mathematically, the appropriate operating rules associated with see and be seen. In response, we simulated how onboard pilots safely operate through see and be seen interactions using an updated visual acquisition model originally developed by J. W. Andrews decades ago. Monte Carlo simulations were representative of two aircraft flying under visual flight rules, and results were analyzed with respect to drone detect and avoid performance standards.
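The kind of simulation described can be sketched with an Andrews-style visual acquisition model: the instantaneous probability of a pilot acquiring an intruder is modeled as a rate proportional to the intruder's angular size (lambda = beta * A / r^2), integrated over a closing encounter. The parameter values, head-on geometry, and cutoff range below are illustrative assumptions, not the paper's calibrated values or any performance standard.

```python
import math
import random

def acquisition_probability(beta, area, closure_rate, r0,
                            r_min=500.0, dt=0.1):
    """P(pilot visually acquires the intruder before range falls to
    r_min), treating acquisition as a nonhomogeneous Poisson process
    with rate lambda = beta * area / r^2 (head-on geometry)."""
    p_not_acquired, r = 1.0, r0
    while r > r_min:
        lam = beta * area / (r * r)  # acquisition rate, 1/s
        p_not_acquired *= math.exp(-lam * dt)
        r -= closure_rate * dt
    return 1.0 - p_not_acquired

# One Monte Carlo sweep over uncertain closure rates (illustrative).
random.seed(0)
samples = [acquisition_probability(beta=5.5e3, area=10.0,
                                   closure_rate=random.uniform(80, 120),
                                   r0=5000.0)
           for _ in range(1000)]
print(f"mean P(acquire) = {sum(samples) / len(samples):.3f}")
```

Faster closures leave less time at acquirable ranges, so sampled acquisition probability drops as closure rate rises, which is the kind of trade a detect-and-avoid performance baseline has to capture.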


Texas teen rescued from suspected trafficker's NC shed may have met him through video game

FOX News

Jorge Ivan Santos Camacho, 34, is accused of grooming the teen online, driving down to Dallas to pick her up and abducting her to Lexington, where he allegedly sexually assaulted her and kept her locked in a shed where he was living. The North Carolina man accused of trafficking a Texas girl across the country and locking her in a shed may have met her through online video games, early missing person flyers show. Jorge Ivan Santos Camacho is charged with a slew of child sex crimes, including statutory rape and human trafficking, for allegedly taking the girl from her home in Dallas 1,000 miles away, to Lexington, North Carolina, where deputies found her locked in an outbuilding that he was living in, according to court documents. A missing person flyer circulating on March 4 said the girl had last been seen the evening of March 1, leaving her family's apartment wearing a hat with an image from the TV-MA-rated Japanese anime series "Demon Slayer." "She was engaged in gaming, and the family reported a suspicious message in the gaming account," the post reads.


Hitting the Books: How one of our first 'smart' weapons helped stop the Nazis

Engadget

At the outset of World War II, you'd have a better chance of finding a needle in a haystack with a camel stuck in its eye than you did of shooting down an enemy aircraft in your first dozen or so shots. This is because anti-aircraft shells at the time used manual fuses that had to be dialed in for specific lengths of time to delay their explosion. The idea was that you'd estimate where the targeted plane would be in, say, five seconds, based on its current flight path, then time the shell for that length, fire the shell at the plane and hope that the timing and location were close enough that shrapnel from the exploding shell would hit the plane. If your calculations were off by even a hair, the shell would miss by thousands of feet. And if shooting down piloted aircraft was this hard, intercepting Germany's terrifyingly fast V1 and V2 rockets required far more luck than skill. But that's exactly what the team at Section T set out to do.


Letters to the editor

#artificialintelligence

Artificial intelligence is an oxymoron (Technology quarterly, June 13th). Intelligence is an attribute of living things, and can best be defined as the use of information to further survival and reproduction. When a computer resists being switched off, or a robot worries about the future for its children, then, and only then, may intelligence flow. I acknowledge Richard Sutton's "bitter lesson", that attempts to build human understanding into computers rarely work, although there is nothing new here. I was aware of the folly of anthropomorphism as an AI researcher in the mid-1980s.