 South America


Nvidia strikes bumper AI deals with Asia tech giants

BBC News

US chip giant Nvidia will supply more than 260,000 of its most advanced artificial intelligence (AI) chips to South Korea's government, as well as Samsung, LG, and Hyundai. The companies will all deploy the AI chips in factories to make everything from semiconductors and robots to autonomous vehicles. The deals meant that South Korea can now produce intelligence as a new export, chief executive Jensen Huang said. Mr Huang did not disclose the value of the South Korean deals. It caps off a busy week for Nvidia, which on Wednesday became the first company to be valued at $5 trillion and on Thursday saw signs of a thaw in US-China trade relations that may mean it can export more of its chips to China. Speaking at a CEO summit on the sidelines of Asia Pacific Economic Cooperation (Apec) in Gyeongju, South Korea, Mr Huang added that with the chips, companies would be able to create digital twins with other factories around the world.


Russia-Ukraine war: List of key events, day 1,345

Al Jazeera

Russia's Ministry of Defence said its forces took control of the villages of Krasnohirske in Ukraine's Zaporizhia region and Sadove in the Kharkiv region, Russian state news agencies reported.


LLMs Process Lists With General Filter Heads

arXiv.org Artificial Intelligence

We investigate the mechanisms underlying a range of list-processing tasks in LLMs, and we find that LLMs have learned to encode a compact, causal representation of a general filtering operation that mirrors the generic "filter" function of functional programming. Using causal mediation analysis on a diverse set of list-processing tasks, we find that a small number of attention heads, which we dub filter heads, encode a compact representation of the filtering predicate in their query states at certain tokens. We demonstrate that this predicate representation is general and portable: it can be extracted and reapplied to execute the same filtering operation on different collections, presented in different formats, languages, or even in different tasks. However, we also identify situations where transformer LMs can exploit a different strategy for filtering: eagerly evaluating if an item satisfies the predicate and storing this intermediate result as a flag directly in the item representations. Our results reveal that transformer LMs can develop human-interpretable implementations of abstract computational operations that generalize in ways that are surprisingly similar to strategies used in traditional functional programming patterns.
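As a rough programming analogy only (not the paper's method, and not a model of attention heads), the two filtering strategies the abstract contrasts can be sketched in Python. The names `is_fruit`, `lazy_filter`, `eager_flag`, and `select_flagged` and the toy data are all hypothetical, chosen for illustration. The first strategy carries a reusable predicate to whatever collection it is given; the second evaluates the predicate up front and stores the result as a flag next to each item.

```python
from typing import Callable, Iterable, List, Tuple

Predicate = Callable[[str], bool]

# Hypothetical toy vocabulary; an assumption for illustration only.
FRUITS = {"apple", "banana", "cherry", "peach"}


def is_fruit(item: str) -> bool:
    """A portable predicate: it knows nothing about any particular list."""
    return item.lower() in FRUITS


def lazy_filter(predicate: Predicate, items: Iterable[str]) -> List[str]:
    """Strategy 1: apply the predicate at selection time (generic filter)."""
    return [item for item in items if predicate(item)]


def eager_flag(predicate: Predicate, items: Iterable[str]) -> List[Tuple[str, bool]]:
    """Strategy 2: evaluate the predicate up front and store a flag per item."""
    return [(item, predicate(item)) for item in items]


def select_flagged(flagged: Iterable[Tuple[str, bool]]) -> List[str]:
    """Read back the stored flags without re-evaluating the predicate."""
    return [item for item, flag in flagged if flag]


plain = ["apple", "hammer", "cherry", "truck"]
bulleted = ["- Banana", "- Wrench", "- Peach"]

# The same predicate is reused on a differently formatted collection.
print(lazy_filter(is_fruit, plain))
print(lazy_filter(lambda line: is_fruit(line.lstrip("- ")), bulleted))
print(select_flagged(eager_flag(is_fruit, plain)))
```

In this analogy, `is_fruit` plays the role of the predicate that can be extracted and reapplied elsewhere, while `eager_flag` corresponds to storing the intermediate result alongside each item.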


Toward a Public and Secure Generative AI: A Comparative Analysis of Open and Closed LLMs

arXiv.org Artificial Intelligence

Generative artificial intelligence (Gen AI) systems represent a critical technology with far-reaching implications across multiple domains of society. However, their deployment entails a range of risks and challenges that require careful evaluation. To date, there has been a lack of comprehensive, interdisciplinary studies offering a systematic comparison between open-source and proprietary (closed) generative AI systems, particularly regarding their respective advantages and drawbacks. This study aims to: i) critically evaluate and compare the characteristics, opportunities, and challenges of open and closed generative AI models; and ii) propose foundational elements for the development of an Open, Public, and Safe Gen AI framework. As a methodology, we adopted a combined approach that integrates three methods: literature review, critical analysis, and comparative analysis. The proposed framework outlines three key dimensions (openness, public governance, and security) as essential pillars for shaping the future of trustworthy and inclusive Gen AI. Our findings reveal that open models offer greater transparency, auditability, and flexibility, enabling independent scrutiny and bias mitigation. In contrast, closed systems often provide better technical support and ease of implementation, but at the cost of unequal access and weaker accountability and ethical oversight. The research also highlights the importance of multi-stakeholder governance, environmental sustainability, and regulatory frameworks in ensuring responsible development.


SecureReviewer: Enhancing Large Language Models for Secure Code Review through Secure-aware Fine-tuning

arXiv.org Artificial Intelligence

Identifying and addressing security issues during the early phase of the development lifecycle is critical for mitigating the long-term negative impacts on software systems. Code review serves as an effective practice that enables developers to check their teammates' code before integration into the codebase. To streamline the generation of review comments, various automated code review approaches have been proposed, where LLM-based methods have significantly advanced the capabilities of automated review generation. However, existing models primarily focus on general-purpose code review, and their effectiveness in identifying and addressing security-related issues remains underexplored. Moreover, adapting existing code review approaches to target security issues faces substantial challenges, including data scarcity and inadequate evaluation metrics. To address these limitations, we propose SecureReviewer, a new approach designed for enhancing LLMs' ability to identify and resolve security-related issues during code review. Specifically, we first construct a dataset tailored for training and evaluating secure code review capabilities. Leveraging this dataset, we fine-tune LLMs to generate code review comments that can effectively identify security issues and provide fix suggestions with our proposed secure-aware fine-tuning strategy. To mitigate hallucination in LLMs and enhance the reliability of their outputs, we integrate the RAG technique, which grounds the generated comments in domain-specific security knowledge. Additionally, we introduce SecureBLEU, a new evaluation metric designed to assess the effectiveness of review comments in addressing security issues. Experimental results demonstrate that SecureReviewer outperforms state-of-the-art baselines in both security issue detection accuracy and the overall quality and practical utility of generated review comments.


Evaluating the Impact of LLM-Assisted Annotation in a Perspectivized Setting: the Case of FrameNet Annotation

arXiv.org Artificial Intelligence

The use of LLM-based applications as a means to accelerate and/or substitute human labor in the creation of language resources and datasets is a reality. Nonetheless, despite the potential of such tools for linguistic research, comprehensive evaluation of their performance and impact on the creation of annotated datasets, especially under a perspectivized approach to NLP, is still missing. This paper contributes to reducing this gap by reporting on an extensive evaluation of the (semi-)automatization of FrameNet-like semantic annotation by the use of an LLM-based semantic role labeler. The methodology employed compares annotation time, coverage and diversity in three experimental settings: manual, automatic and semi-automatic annotation. Results show that the hybrid, semi-automatic annotation setting leads to increased frame diversity and similar annotation coverage when compared to the human-only setting, while the automatic setting performs considerably worse in all metrics except annotation time.


Approximating Human Preferences Using a Multi-Judge Learned System

arXiv.org Artificial Intelligence

Aligning LLM-based judges with human preferences is a significant challenge, as they are difficult to calibrate and often suffer from rubric sensitivity, bias, and instability. Overcoming this challenge advances key applications, such as creating reliable reward models for Reinforcement Learning from Human Feedback (RLHF) and building effective routing systems that select the best-suited model for a given user query. In this work, we propose a framework for modeling diverse, persona-based preferences by learning to aggregate outputs from multiple rubric-conditioned judges. We investigate the performance of this approach against naive baselines and assess its robustness through case studies on both human and LLM judge biases. Our primary contributions include a persona-based method for synthesizing preference labels at scale and two distinct implementations of our aggregator: a Generalized Additive Model (GAM) and a Multi-Layer Perceptron (MLP).


Evaluating Emotion Recognition in Spoken Language Models on Emotionally Incongruent Speech

arXiv.org Artificial Intelligence

Advancements in spoken language processing have driven the development of spoken language models (SLMs), designed to achieve universal audio understanding by jointly learning text and audio representations for a wide range of tasks. Although promising results have been achieved, there is growing discussion regarding these models' generalization capabilities and the extent to which they truly integrate audio and text modalities in their internal representations. In this work, we evaluate four SLMs on the task of speech emotion recognition using a dataset of emotionally incongruent speech samples, a condition under which the semantic content of the spoken utterance conveys one emotion while speech expressiveness conveys another. Our results indicate that SLMs rely predominantly on textual semantics rather than speech emotion to perform the task, suggesting that text-related representations largely dominate over acoustic representations. We release both the code and the Emotionally Incongruent Synthetic Speech dataset (EMIS) to the community.


Spinning genocide: How is Israel using US PR firms to frame its Gaza war?

Al Jazeera

Israel has contracted at least three public relations companies to bolster its image online and among the United States' Christian right, filings under the Foreign Agents Registration Act (FARA) show. According to US Department of Justice records, Israel's Ministry of Foreign Affairs hired the newly established Bridges Partners, the Christian PR agency Show Faith by Works, and the online consultancy Clock Tower X via the European Havas Media Group. Israel is acutely conscious of the need to control how its war, in which it has killed more than 68,000 Palestinians, is perceived by its allies and sponsors in the US.


Verdicts in as Liam Hemsworth takes over as The Witcher

BBC News

The latest season of Netflix's The Witcher has landed - with one big difference. Former lead actor Henry Cavill has been replaced as main character Geralt of Rivia by Liam Hemsworth. The Australian has stepped in for the final two seasons of the fantasy show, based on a popular series of novels and video games. Previously, British actor Cavill had portrayed the title character, a monster hunter with supernatural abilities known as the White Wolf. When he announced he was passing the torch to Hemsworth in October 2022, describing him as a fantastic actor, not all fans agreed.