 Law


From Capabilities to Performance: Evaluating Key Functional Properties of LLM Architectures in Penetration Testing

arXiv.org Artificial Intelligence

Large language models (LLMs) are increasingly used to automate or augment penetration testing, but their effectiveness and reliability across attack phases remain unclear. We present a comprehensive evaluation of multiple LLM-based agents, from single-agent to modular designs, across realistic penetration testing scenarios, measuring empirical performance and recurring failure patterns. We also isolate the impact of five core functional capabilities via targeted augmentations: Global Context Memory (GCM), Inter-Agent Messaging (IAM), Context-Conditioned Invocation (CCI), Adaptive Planning (AP), and Real-Time Monitoring (RTM). These interventions support, respectively: (i) context coherence and retention, (ii) inter-component coordination and state management, (iii) tool use accuracy and selective execution, (iv) multi-step strategic planning, error detection, and recovery, and (v) real-time dynamic responsiveness. Our results show that while some architectures natively exhibit subsets of these properties, targeted augmentations substantially improve modular agent performance, especially in complex, multi-step, and real-time penetration testing tasks.


Retrieval-Augmented Generation for Reliable Interpretation of Radio Regulations

arXiv.org Artificial Intelligence

We study question answering in the domain of radio regulations, a legally sensitive and high-stakes area. We propose a telecom-specific Retrieval-Augmented Generation (RAG) pipeline and introduce, to our knowledge, the first multiple-choice evaluation set for this domain, constructed from authoritative sources using automated filtering and human validation. To assess retrieval quality, we define a domain-specific retrieval metric, under which our retriever achieves approximately 97% accuracy. Beyond retrieval, our approach consistently improves generation accuracy across all tested models. In particular, while naively inserting documents without structured retrieval yields only marginal gains for GPT-4o (less than 1%), applying our pipeline results in nearly a 12% relative improvement. These findings demonstrate that carefully targeted grounding provides a simple yet strong baseline and an effective domain-specific solution for regulatory question answering. All code and evaluation scripts, along with our derived question-answer dataset, are available at https://github.com/Zakaria010/Radio-RAG.


Dual-Mode Deep Anomaly Detection for Medical Manufacturing: Structural Similarity and Feature Distance

arXiv.org Artificial Intelligence

Automated visual inspection in medical-device manufacturing faces unique challenges, including extremely low defect rates, limited annotated data, hardware restrictions on production lines, and the need for validated, explainable artificial-intelligence systems. This paper presents two attention-guided autoencoder architectures that address these constraints through complementary anomaly-detection strategies. The first employs a multi-scale structural-similarity (4-MS-SSIM) index for inline inspection, enabling interpretable, real-time defect detection on constrained hardware. The second applies a Mahalanobis-distance analysis of randomly reduced latent features for efficient feature-space monitoring and lifecycle verification. Both approaches share a lightweight backbone optimised for high-resolution imagery under typical manufacturing conditions. Evaluations on the Surface Seal Image (SSI) dataset, which represents sterile-barrier packaging inspection, demonstrate that the proposed methods outperform reference baselines, including MOCCA, CPCAE, and RAG-PaDiM, under realistic industrial constraints. Cross-domain validation on the MVTec-Zipper benchmark confirms comparable accuracy to state-of-the-art anomaly-detection methods. The dual-mode framework integrates inline anomaly detection and supervisory monitoring, advancing explainable AI architectures toward greater reliability, observability, and lifecycle monitoring in safety-critical manufacturing environments. To facilitate reproducibility, the source code developed for the experiments has been released in the project repository, while the datasets were obtained from publicly available sources.


Supplemental Materials

Neural Information Processing Systems

We bear all responsibility in case of violation of rights, etc., and confirmation of the data license. This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License. This license permits sharing and adapting the work, provided it is not used for commercial purposes and appropriate credit is given. Please refer to Section 3 for our hosting plan. In this section, we use the framework of Datasheets for Datasets [?] to form a datasheet for CRAG. For what purpose was the dataset created? Was there a specific task in mind?


Jeffrey Epstein Claimed Intimate Knowledge of Donald Trump's Views in Texts With Bill Gates Adviser

WIRED

In text messages from 2017, Jeffrey Epstein seemingly represented himself as positioned to pass information from the Trump White House to Bill Gates through an intermediary. In text messages sent in 2017, disgraced financier and registered sex offender Jeffrey Epstein appears to position himself as a middleman between president Donald Trump's administration and Microsoft cofounder Bill Gates, even seemingly representing himself as passing on information directly from Trump to Gates through an intermediary. The messages, which the House Committee on Oversight and Government Reform released on Wednesday and originated with the Epstein estate, begin on January 27, 2017, years after Epstein had already pleaded guilty to state prostitution solicitation charges. In them, Epstein purports to show intimate awareness of Trump's plans for domestic and global public health policy, and to be directly familiar with the president's thinking. Trump has continued to claim, as recently as this summer, that he stopped speaking with Epstein around 2004.


UK billionaire Joe Lewis receives pardon from Trump

BBC News

Billionaire UK businessman Joe Lewis, whose family trust owns Tottenham Hotspur football club, has received a pardon from US President Donald Trump. Lewis, 88, pleaded guilty to insider trading as part of an agreement with prosecutors in 2024 that saw him avoid prison. He was accused of passing on information about his companies to his private pilots, friends, personal assistants and romantic partners in a fraud that authorities said netted millions of dollars in profit. A White House official said Trump approved the pardon for Lewis, who requested it so he could receive medical treatment and visit his grandchildren and great grandchildren in the US. Mr Lewis admitted he made a terrible mistake, did not fight extradition in the case, and paid a $5 million fine, the official told the BBC.


Stewart Rhodes Relaunched the Oath Keepers. Even Old Oath Keepers Don't Care

WIRED

Militia leader Stewart Rhodes, who was convicted for his role in the January 6 attack, is asking potential new members and supporters to send money. Stewart Rhodes announced last week that he is relaunching the Oath Keepers, his anti-government militia which virtually disappeared after dozens of its members, including Rhodes, were arrested for their roles in the January 6 attack on the Capitol. Rhodes, speaking to the Gateway Pundit this week, says that he sees the relaunched group as playing a role in combating what he labeled an "insurrection by the left" on the streets of US cities. "Right now, under federal statutes, president Trump can call us up as the militia if he sees it necessary, especially for three purposes: to repel invasions, to suppress insurrections, and to execute the laws of the union," Rhodes said. But in the days since Rhodes announced their return, experts, former members, and online chatter suggest there is little to no interest in restarting what was, at one point, one of the largest militias in America with a leaked database listing 38,000 supposed members in 2021. This hasn't stopped Rhodes from asking potential new members and supporters to send money in support of the cause.


Lack of trust and racism concerns: Five key failings in Sara Sharif review

BBC News

An independent review of the Sara Sharif case has identified multiple failings from agencies before her murder in Surrey in 2023, following two years of abuse. The child safeguarding practice review, published on Thursday, said there were clearly several points in Sara's life, in particular during the last few months, where different actions could and should have been taken by the authorities. The system failed to keep her safe, it added. Responding to the report, the Children's Commissioner said the case was a catalogue of missed opportunities, poor communication and ill-informed assumptions. The education secretary said there had been glaring failures across all agencies.


5 Things to Know Before Using an AI Browser

TIME - Tech

A smartphone shows the official website of ChatGPT Atlas. "It'd be really nice to have a service that was sort of just observing your life and proactively helping you when you needed it," said OpenAI CEO Sam Altman in a recent Q&A about OpenAI's plans. This vision is at the heart of a new crop of AI browsers, notably OpenAI's ChatGPT Atlas and Perplexity's Comet. AI browsers differ from traditional browsers in at least two important ways.


AI Relationships Are on the Rise. A Divorce Boom Could Be Next

WIRED

Secret chatbot flings are creating new legal challenges for married couples when it comes to infidelity. Rebecca Palmer isn't a psychic, but as a divorce attorney she can often see what's coming next. For many people today, as AI saturates every aspect of life, from work to therapy, the allure of an AI romance is tantalizing. Chatbots are dependable, can provide emotional support, and, for the most part, will never pick a fight with you.