 Sevastopol


Ukrainian drones strike Sevastopol museum and key Russian oil refineries

Al Jazeera

Ukrainian drones have struck a historic museum in Russia-annexed Sevastopol in Crimea, igniting a roof fire, as Russian authorities slashed nighttime train schedules amid intensifying air attacks across the peninsula and deep into Russia. Sevastopol's Russian-installed governor, Mikhail Razvozhayev, announced the damage on Telegram early on Wednesday. "This building is not just a museum, it is a symbol of resilience, which has repeatedly taken the blows of the enemy." Razvozhayev said that during World War II's Siege of Sevastopol, "the Panorama building was subjected to massed bombing by German aviation". He declared: "The enemy will pay for this sacrilege!"


Ukraine says it carried out first-ever underwater drone strike on Russian submarine in Novorossiysk

FOX News


Ukraine's 'Spiderweb' drone assault forces Russia to shelter, move aircraft

Al Jazeera

Russia's increased sense of vulnerability may be the most important result of a recent large-scale Ukrainian drone attack named Operation Spiderweb, experts tell Al Jazeera. The operation destroyed as much as a third of Russia's strategic bomber fleet on the tarmac of four airfields deep inside Russia on June 1. Days later, Russia started to build shelters for its bombers and relocate them. An open source intelligence (OSINT) researcher nicknamed Def Mon posted time-lapse satellite photographs on social media showing major excavations at the Kirovskoe airfield in annexed Crimea as well as in Sevastopol, Gvardiyskoye and Saki, where Russia was constructing shelters for military aircraft. They reported similar work at several airbases in Russia, including the Engels base, which was targeted in Ukraine's attacks on June 1.


Ukraine bombs Russian bases: Here are some of Kyiv's most audacious attacks

Al Jazeera

Ukrainian drones struck multiple military airbases deep inside Russia on Sunday in a major operation a day before the neighbours held peace talks in Istanbul. The Russian Defence Ministry said Ukraine had launched drone strikes targeting Russian military airfields across five regions, causing several aircraft to catch fire. The attacks occurred in the Murmansk, Irkutsk, Ivanovo, Ryazan, and Amur regions. Air defences repelled the assaults in all but two regions – Murmansk and Irkutsk, the ministry said. "In the Murmansk and Irkutsk regions, the launch of FPV drones from an area in close proximity to airfields resulted in several aircraft catching fire," the Defence Ministry said.


SPIN-Bench: How Well Do LLMs Plan Strategically and Reason Socially?

arXiv.org Artificial Intelligence

Reasoning and strategic behavior in social interactions is a hallmark of intelligence. This form of reasoning is significantly more sophisticated than isolated planning or reasoning tasks in static settings (e.g., math problem solving). In this paper, we present Strategic Planning, Interaction, and Negotiation (SPIN-Bench), a new multi-domain evaluation designed to measure the intelligence of strategic planning and social reasoning. While many existing benchmarks focus on narrow planning or single-agent reasoning, SPIN-Bench combines classical PDDL tasks, competitive board games, cooperative card games, and multi-agent negotiation scenarios in one unified framework. The framework includes both a benchmark as well as an arena to simulate and evaluate the variety of social settings to test reasoning and strategic behavior of AI agents. We formulate the benchmark SPIN-Bench by systematically varying action spaces, state complexity, and the number of interacting agents to simulate a variety of social settings where success depends on not only methodical and step-wise decision making, but also conceptual inference of other (adversarial or cooperative) participants. Our experiments reveal that while contemporary LLMs handle basic fact retrieval and short-range planning reasonably well, they encounter significant performance bottlenecks in tasks requiring deep multi-hop reasoning over large state spaces and socially adept coordination under uncertainty. We envision SPIN-Bench as a catalyst for future research on robust multi-agent planning, social reasoning, and human-AI teaming. Project Website: https://spinbench.github.io/


DSGBench: A Diverse Strategic Game Benchmark for Evaluating LLM-based Agents in Complex Decision-Making Environments

arXiv.org Artificial Intelligence

Large Language Model (LLM) based agents have become increasingly popular for solving complex and dynamic tasks, which calls for proper evaluation systems to assess their capabilities. Nevertheless, existing benchmarks usually either focus on single-objective tasks or use overly broad assessment metrics, failing to provide a comprehensive inspection of the actual capabilities of LLM-based agents in complicated decision-making tasks. To address these issues, we introduce DSGBench, a more rigorous evaluation platform for strategic decision-making. Firstly, it incorporates six complex strategic games which serve as ideal testbeds due to their long-term and multi-dimensional decision-making demands and flexibility in customizing tasks of various difficulty levels or multiple targets. Secondly, DSGBench employs a fine-grained evaluation scoring system which examines the decision-making capabilities by looking into the performance in five specific dimensions and offering a comprehensive assessment in a well-designed way. Furthermore, DSGBench also incorporates an automated decision-tracking mechanism which enables in-depth analysis of agent behaviour patterns and the changes in their strategies. We demonstrate the advances of DSGBench by applying it to multiple popular LLM-based agents and our results suggest that DSGBench provides valuable insights in choosing LLM-based agents as well as improving their future development. DSGBench is available at https://github.com/DeciBrain-Group/DSGBench.


HoT: Highlighted Chain of Thought for Referencing Supporting Facts from Inputs

arXiv.org Artificial Intelligence

An Achilles heel of Large Language Models (LLMs) is their tendency to hallucinate non-factual statements. A response that mixes factual and non-factual statements is difficult for humans to verify and to base their decisions on accurately. To combat this problem, we propose Highlighted Chain-of-Thought Prompting (HoT), a technique for prompting LLMs to generate responses with XML tags that ground facts to those provided in the query. That is, given an input question, LLMs would first re-format the question to add XML tags highlighting key facts, and then generate a response with highlights over the facts referenced from the input. Interestingly, in few-shot settings, HoT outperforms vanilla chain of thought prompting (CoT) on a wide range of 17 tasks, from arithmetic and reading comprehension to logical reasoning. When asking humans to verify LLM responses, highlights help time-limited participants to more accurately and efficiently recognize when LLMs are correct. Yet, surprisingly, when LLMs are wrong, HoTs tend to make users believe that an answer is correct.


BordIRlines: A Dataset for Evaluating Cross-lingual Retrieval-Augmented Generation

arXiv.org Artificial Intelligence

Large language models excel at creative generation but continue to struggle with the issues of hallucination and bias. While retrieval-augmented generation (RAG) provides a framework for grounding LLMs' responses in accurate and up-to-date information, it still raises the question of bias: which sources should be selected for inclusion in the context? And how should their importance be weighted? In this paper, we study the challenge of cross-lingual RAG and present a dataset to investigate the robustness of existing systems at answering queries about geopolitical disputes, which exist at the intersection of linguistic, cultural, and political boundaries. Our dataset is sourced from Wikipedia pages containing information relevant to the given queries and we investigate the impact of including additional context, as well as the composition of this context in terms of language and source, on an LLM's response. Our results show that existing RAG systems continue to be challenged by cross-lingual use cases and suffer from a lack of consistency when they are provided with competing information in multiple languages. We present case studies to illustrate these issues and outline steps for future research to address these challenges. We make our dataset and code publicly available at https://github.com/manestay/bordIRlines.


Ukraine's navy chief says Russian warships are leaving Crimean hub in Black Sea

FOX News

The Russian navy's Black Sea Fleet has been forced to rebase nearly all its combat-ready warships from occupied Crimea to other locations, and its main naval hub is becoming ineffectual because of attacks by Kyiv, Ukraine's navy chief said. Vice-Admiral Oleksiy Neizhpapa said Ukrainian missile and naval drone strikes had caused heavy damage to the Sevastopol base, a logistics hub for repairs, maintenance, training and ammunition storage among other important functions for Russia. "They were established over many decades, possibly centuries. And clearly they are now losing this hub," Neizhpapa told Reuters in a rare interview in the port city of Odesa ahead of Ukraine Navy Day on Sunday. More than 28 months since Russia's full-scale invasion, Kyiv has dealt a series of stinging blows to Moscow in the Black Sea although Ukrainian ground troops are on the back foot across a sprawling front.


Ukrainian maritime attack on Black Sea port Novorossiysk repelled: Russia

Al Jazeera

Russia says it destroyed two Ukrainian sea drones targeting the Black Sea port of Novorossiysk, a key naval base and oil shipping outlet. The Ministry of Defence in Moscow said on Wednesday that Russian forces had destroyed the naval drones as they advanced on the port in an overnight attack. Ukraine has reported success in targeting Russian ships and infrastructure in the Black Sea over recent months. "Two unmanned boats travelling in the direction of Novorossiysk were destroyed in the waters of the Black Sea," the ministry said in a post on Telegram. The attack caused no damage or shipping disruptions, the local city administration reported, according to Russian state news agencies.