 Energy


Risk Analysis Techniques for Governed LLM-based Multi-Agent Systems

arXiv.org Artificial Intelligence

Organisations are starting to adopt LLM-based AI agents, with their deployments naturally evolving from single agents towards interconnected, multi-agent networks. Yet a collection of safe agents does not guarantee a safe collection of agents, as interactions between agents over time create emergent behaviours and induce novel failure modes. This means multi-agent systems require a fundamentally different risk analysis approach from that used for a single agent. This report addresses the early stages of risk identification and analysis for multi-agent AI systems operating within governed environments where organisations control their agent configurations and deployment. In this setting, we examine six critical failure modes: cascading reliability failures, inter-agent communication failures, monoculture collapse, conformity bias, deficient theory of mind, and mixed motive dynamics. For each, we provide a toolkit for practitioners to extend or integrate into their existing frameworks to assess these failure modes within their organisational contexts. Given fundamental limitations in the current understanding of LLM behaviour, our approach centres on analysis validity. It advocates progressively increasing that validity through testing staged across levels of abstraction and deployment, gradually increasing exposure to potential negative impacts while collecting convergent evidence through simulation, observational analysis, benchmarking, and red teaming. This methodology establishes the groundwork for robust organisational risk management as these LLM-based multi-agent systems are deployed and operated.


Enhancing Retrieval-Augmented Generation for Electric Power Industry Customer Support

arXiv.org Artificial Intelligence

Many AI customer service systems use standard NLP pipelines or fine-tuned language models, which often fall short on ambiguous, multi-intent, or detail-specific queries. This case study evaluates five recent techniques (query rewriting, RAG Fusion, keyword augmentation, intent recognition, and context reranking) for building a robust customer support system in the electric power domain. We compare vector-store and graph-based RAG frameworks, ultimately selecting graph-based RAG for its superior performance in handling complex queries. We find that query rewriting improves retrieval for queries using non-standard terminology or requiring precise detail. RAG Fusion boosts performance on vague or multifaceted queries by merging multiple retrievals. Reranking reduces hallucinations by filtering irrelevant contexts. Intent recognition supports the decomposition of complex questions into more targeted sub-queries, increasing both relevance and efficiency. In contrast, keyword augmentation negatively impacts results due to biased keyword selection. Our final system combines intent recognition, RAG Fusion, and reranking to handle disambiguation and multi-source queries. Evaluated on both a GPT-4-generated dataset and a real-world electricity provider FAQ dataset, it achieves 97.9% and 89.6% accuracy respectively, substantially outperforming baseline RAG models.
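The "merging multiple retrievals" step can be illustrated with reciprocal rank fusion, the merging rule commonly associated with RAG Fusion. The abstract does not say which rule the authors use, so the following is only a minimal sketch under that assumption; the function name, the constant `k=60`, and the document IDs are all hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked highly by several query variants rise to the top.
    k=60 is a commonly used default, assumed here rather than taken from the paper.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical retrieval results for three rewrites of one customer query.
rankings = [
    ["tariff_faq", "outage_guide", "billing_policy"],
    ["outage_guide", "tariff_faq", "meter_manual"],
    ["outage_guide", "billing_policy", "tariff_faq"],
]
fused = reciprocal_rank_fusion(rankings)
print(fused)
```

In this toy input, `outage_guide` ends up first because it ranks at or near the top of all three lists, while `meter_manual`, retrieved only once and in last position, ends up last.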


iTFKAN: Interpretable Time Series Forecasting with Kolmogorov-Arnold Network

arXiv.org Artificial Intelligence

Over time, data within specific domains exhibit predictability, which motivates time series forecasting: predicting future trends from historical data. Current deep forecasting methods achieve promising performance but generally lack interpretability, hindering trustworthiness and practical deployment in safety-critical applications such as autonomous driving and healthcare. In this paper, we propose a novel interpretable model, iTFKAN, for credible time series forecasting. Because its interpretability is achieved through model symbolization, iTFKAN enables further exploration of model decision rationales and underlying data patterns. In addition, iTFKAN develops two strategies, prior knowledge injection and time-frequency synergy learning, to effectively guide model learning on complex, intertwined time series data. Extensive experimental results demonstrate that iTFKAN achieves promising forecasting performance while remaining highly interpretable.



Can an AI chatbot of Dr Karl change climate sceptics' minds? He's willing to give it a try

The Guardian

There's arguably no face, voice or collection of exuberant, patterned shirts more recognisable than those belonging to Dr Karl Kruszelnicki. The bespectacled boffin has been answering curly listener questions about science, with characteristic excitement and passion, for more than 40 years. Despite a seemingly tireless work ethic, Kruszelnicki, now 77 years old, can't be everywhere all at once. Those questions now come in waves, across social media platforms at all hours of the day. "Sometimes I get 300 requests a day on Twitter to answer an involved question about climate change," Kruszelnicki says.


Ukraine says it hit Russian oil refinery in drone exchanges; key talks loom

Al Jazeera

Ukraine's military has said it struck an oil refinery in Russia's Saratov region in an overnight drone attack, causing explosions and destruction, according to an army statement, as daily aerial exchanges intensify with diplomatic momentum to end the war in play. Saratov's governor said on Sunday that one person was killed and several residential apartments and an industrial facility were damaged, but did not mention the oil refinery being struck. "[Ukrainian] drones are targeting … deeper into Russian territory [than] in the past, where previous attacks have been focused on the line of contact in the south and the western parts of Russia," said Al Jazeera's Osama Bin Javaid, reporting from Moscow. It is still unclear whether Ukraine's claims that it hit a refinery are true, he added. Ukraine's military also said on Sunday that it had taken back a village in the Sumy region from the Russian army, which has made significant recent gains there.


What if L.A.'s so-called flaws were underappreciated assets rather than liabilities?

Los Angeles Times

In the wake of January's horrific fires, detractors of Los Angeles -- an urban reality often seen as a toxic mixture of unsustainable resource planning and structurally poor governance systems -- are having a field day. Their criticism is not new: For most of the 20th century -- and certainly for the last five decades or so -- Los Angeles has been seen by many urbanists as less city and more cautionary tale -- a smoggy expanse of subdivisions and spaghetti junctions, where ambition came with a two-hour commute. Planners shuddered, while architects looked away, even as they accepted handsome commissions to build some of L.A.'s -- if not the world's -- most iconic buildings.


'It's missing something': AGI, superintelligence and a race for the future

The Guardian

That was how Sam Altman, chief executive of OpenAI, described the latest upgrade to ChatGPT this week. The race Altman was referring to was artificial general intelligence (AGI), a theoretical state of AI where, by OpenAI's definition, a highly autonomous system is able to do a human's job. Describing the new GPT-5 model, which will power ChatGPT, as a "significant step on the path to AGI", he nonetheless added a hefty caveat. "[It is] missing something quite important, many things quite important," said Altman, such as the model's inability to "continuously learn" even after its launch. In other words, these systems are impressive but they have yet to crack the autonomy that would allow them to do a full-time job.


OpenAI will not disclose GPT-5's energy use. It could be higher than past models

The Guardian

In mid-2023, if a user asked OpenAI's ChatGPT for a recipe for artichoke pasta or instructions on how to make a ritual offering to the ancient Canaanite deity Moloch, its response might have taken – very roughly – 2 watt-hours, or about as much electricity as an incandescent bulb consumes in 2 minutes. OpenAI released a model on Thursday that will underpin the popular chatbot – GPT-5. Ask that version of the AI for an artichoke recipe, and the same amount of pasta-related text could take several times – even 20 times – that amount of energy, experts say. As it rolled out GPT-5, the company highlighted the model's breakthrough capabilities: its ability to create websites, answer PhD-level science questions, and reason through difficult problems. But experts who have spent the past years working to benchmark the energy and resource usage of AI models say those new powers come at a cost: a response from GPT-5 may take a significantly larger amount of energy than a response from previous versions of ChatGPT.
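The bulb comparison above can be checked with a few lines of arithmetic. This is a rough sketch that assumes a 60-watt incandescent bulb, a wattage the article does not state; the function and variable names are illustrative only.

```python
def watt_hours(power_watts, minutes):
    """Energy in watt-hours used by a device drawing power_watts for the given minutes."""
    return power_watts * minutes / 60.0


BULB_WATTS = 60  # assumed bulb rating; not stated in the article

# Baseline: a bulb running for 2 minutes, matching the "very roughly 2 watt-hours" figure.
baseline_wh = watt_hours(BULB_WATTS, 2)

# Upper end of the experts' range: 20 times the baseline.
upper_wh = baseline_wh * 20

# Express the upper estimate as minutes of the same bulb.
upper_bulb_minutes = upper_wh * 60 / BULB_WATTS

print(baseline_wh, upper_wh, upper_bulb_minutes)
```

Under that assumption, the 20-times figure corresponds to about 40 watt-hours per response, or the same bulb left on for roughly 40 minutes.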