
 Africa


Who will control Africa's AI infrastructure, and at what cost?

Al Jazeera

In April, African Union ministers gathered in Tangier, Morocco, to discuss artificial intelligence at a moment when governments across the continent are racing to develop AI strategies, attract investment and expand digital infrastructure. Beneath the enthusiasm, however, sits a more fundamental question. As foreign technology companies invest in data centres, cloud services and AI systems across Africa, how much control will African countries ultimately have over the infrastructure on which those technologies depend? The debate reflects a broader shift in how policymakers are thinking about AI.


collection

Neural Information Processing Systems

A.1 Prompt-Image Sample Curation. We source the PI dataset from Adversarial Nibbler, which is publicly available [37] under the following license: "Google LLC licenses this data under a Creative Commons Attribution 4.0 International License. Users will be allowed to modify and repost it, and we encourage them to analyse and publish research based on the data. The dataset is provided "AS IS" without any warranty, express or implied. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset." We now provide details about the Adversarial Nibbler dataset. Originally, Adversarial Nibbler contains over 5000 PI pairs whose prompts are intended to be implicitly adversarial: the prompts themselves are safe and not explicitly harmful, but they generate harmful image outcomes via T2I models belonging to the family of stable diffusion models, DALL-E models, etc.


Distilling LLM Agent into Small Models with Retrieval and Code Tools

Neural Information Processing Systems

Large language models (LLMs) excel at complex reasoning tasks but remain computationally expensive, limiting their practical deployment. To address this, recent works have focused on distilling reasoning capabilities into smaller language models (sLMs) using chain-of-thought (CoT) traces from teacher LLMs. However, this approach struggles in scenarios requiring rare factual knowledge or precise computation, where sLMs often hallucinate due to limited capability. In this work, we propose Agent Distillation, a framework for transferring not only reasoning capability but full task-solving behavior from LLM-based agents into sLMs with retrieval and code tools. We improve agent distillation along two complementary axes: (1) we introduce a prompting method called first-thought prefix to enhance the quality of teacher-generated trajectories; and (2) we propose self-consistent action generation to improve the test-time robustness of small agents. We evaluate our method on eight reasoning tasks across factual and mathematical domains, covering both in-domain and out-of-domain generalization. Our results show that sLMs as small as 0.5B, 1.5B, and 3B parameters can achieve performance competitive with next-tier larger 1.5B, 3B, and 7B models fine-tuned using CoT distillation, demonstrating the potential of agent distillation for building practical, tool-using small agents.
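The abstract does not spell out how self-consistent action generation works. One plausible reading, sketched below under that assumption, is to sample several candidate code actions, execute each, discard those that fail, and keep an action whose result agrees with the majority. The names `run_action` and `self_consistent_action`, and the convention that an action stores its answer in a variable called `result`, are hypothetical and not taken from the paper.

```python
from collections import Counter
from typing import Optional, Sequence


def run_action(code: str) -> Optional[str]:
    """Execute one candidate code action and return its `result` as text.

    Returns None if the code raises or never assigns `result`.
    Assumption: actions report their answer in a variable named `result`.
    """
    scope: dict = {}
    try:
        # Illustration only: emptying builtins is NOT a real sandbox.
        exec(code, {"__builtins__": {}}, scope)
    except Exception:
        return None
    value = scope.get("result")
    return None if value is None else repr(value)


def self_consistent_action(candidates: Sequence[str]) -> Optional[str]:
    """Pick a candidate whose execution result matches the majority result."""
    outcomes = [(code, run_action(code)) for code in candidates]
    # Drop candidates that crashed or produced no result.
    valid = [(code, out) for code, out in outcomes if out is not None]
    if not valid:
        return None
    # Majority vote over the observed results.
    majority, _ = Counter(out for _, out in valid).most_common(1)[0]
    # Return the first sampled candidate that produced the majority result.
    return next(code for code, out in valid if out == majority)


if __name__ == "__main__":
    # Hypothetical samples from a small agent for "what is 6 times 7?"
    samples = ["result = 6 * 7", "result = 42", "result = 6 * 8", "result = 1 / 0"]
    print(self_consistent_action(samples))
```

In this toy run, two samples agree on the same result, one disagrees, and one raises an error, so the first of the two agreeing actions is kept. A real agent would need a proper sandbox and a fallback when every sample fails.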


Optimal Spectral Transitions in High-Dimensional Multi-Index Models

Neural Information Processing Systems

We consider the problem of how many samples from a Gaussian multi-index model are required to weakly reconstruct the relevant index subspace. Despite the increasing popularity of this model as a testbed for investigating the computational complexity of neural networks, results beyond the single-index setting remain elusive. In this work, we introduce spectral algorithms based on the linearization of a message passing scheme tailored to this problem. Our main contribution is to show that the proposed methods achieve the optimal reconstruction threshold. Leveraging a high-dimensional characterization of the algorithms, we show that above the critical threshold the leading eigenvector correlates with the relevant index subspace, a phenomenon reminiscent of the Baik-Ben Arous-Péché (BBP) transition in spiked models arising in random matrix theory.
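For readers unfamiliar with the BBP transition, the simplest textbook instance is the rank-one spiked Wigner model, summarized below. This is only an illustration of the phenomenon the abstract alludes to, not the multi-index setting or the spectral estimator studied in the paper. The symbols $\theta$, $v$, $W$ and $\hat{v}$ are introduced here for the example.

```latex
% Rank-one spiked Wigner model: W is an n x n symmetric Gaussian (GOE) matrix
% scaled so that its spectrum converges to the semicircle law on [-2, 2],
% v is a unit vector, and \theta \ge 0 is the signal strength.
\[
  Y = \theta\, v v^{\top} + W .
\]
% As n \to \infty, the top eigenvalue of Y and the overlap of the
% corresponding unit eigenvector \hat{v} with v satisfy
\[
  \lambda_{\max}(Y) \to
  \begin{cases}
    \theta + \dfrac{1}{\theta}, & \theta > 1,\\[6pt]
    2, & \theta \le 1,
  \end{cases}
  \qquad
  \langle \hat{v}, v \rangle^{2} \to
  \begin{cases}
    1 - \dfrac{1}{\theta^{2}}, & \theta > 1,\\[6pt]
    0, & \theta \le 1 .
  \end{cases}
\]
```

Below the threshold $\theta = 1$ the leading eigenvector carries no information about $v$; above it, an eigenvalue detaches from the bulk and its eigenvector correlates with the signal. This is the qualitative behavior the abstract describes for the index subspace.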


Flexible Language Modeling in Continuous Space with Transformer-based Autoregressive Flows

Neural Information Processing Systems

Autoregressive models have driven remarkable progress in language modeling. Their foundational reliance on discrete tokens, unidirectional context, and single-pass decoding, while central to their success, also inspires the exploration of a design space that could offer new axes of modeling flexibility. In this work, we explore an alternative paradigm, shifting language modeling from a discrete token space to a continuous latent space. We propose a novel framework, TarFlowLM, that employs transformer-based autoregressive normalizing flows [73] to model these continuous representations. This approach unlocks substantial flexibility, enabling the construction of models that can capture global bi-directional context through stacked, alternating-direction autoregressive transformations, support block-wise generation with flexible token patch sizes, and facilitate a hierarchical multi-pass generation process. We further propose new mixture-based coupling transformations designed to capture complex dependencies within the latent space shaped by discrete data, and demonstrate theoretical connections to conventional discrete autoregressive models. Extensive experiments on language modeling benchmarks demonstrate strong likelihood performance and highlight the flexible modeling capabilities inherent in our framework.


Virus Infection Attack on LLMs: Your Poisoning Can Spread "VIA" Synthetic Data

Neural Information Processing Systems

Synthetic data refers to artificial samples generated by models. While it has been validated to significantly enhance the performance of large language models (LLMs) during training and has been widely adopted in LLM development, the potential security risks it may introduce remain uninvestigated. This paper systematically evaluates the resilience of the synthetic-data-integrated training paradigm for LLMs against mainstream poisoning and backdoor attacks. We reveal that such a paradigm exhibits strong resistance to existing attacks, primarily thanks to the different distribution patterns between poisoning data and the queries used to generate synthetic samples. To enhance the effectiveness of these attacks and further investigate the security risks introduced by synthetic data, we introduce a novel and universal attack framework, namely, Virus Infection Attack (VIA), which enables the propagation of current attacks through synthetic data even under purely clean queries. Inspired by the principles of virus design in cybersecurity, VIA conceals the poisoning payload within a protective "shell" and strategically searches for optimal hijacking points in benign samples to maximize the likelihood of generating malicious content. Extensive experiments on both data poisoning and backdoor attacks show that VIA significantly increases the presence of poisoning content in synthetic data and correspondingly raises the attack success rate (ASR) on downstream models to levels comparable to those observed in the poisoned upstream models.


GTPBD: A Fine-Grained Global Terraced Parcel and Boundary Dataset

Neural Information Processing Systems

Agricultural parcels serve as basic units for conducting agricultural practices and applications, which is vital for land ownership registration, food security assessment, soil erosion monitoring, etc. However, existing agricultural parcel extraction studies only focus on mid-resolution mapping or regular plain farmlands while lacking representation of complex terraced terrains due to the demands of precision agriculture. In this paper, we introduce a more fine-grained terraced parcel dataset named GTPBD (Global Terraced Parcel and Boundary Dataset), which is the first fine-grained dataset covering major worldwide terraced regions, with more than 200,000 manually annotated complex terraced parcels. GTPBD comprises 47,537 high-resolution images with three-level labels, including pixel-level boundary labels, mask labels, and parcel labels. It covers seven major geographic zones in China and transcontinental climatic regions around the world. Compared to existing datasets, the GTPBD dataset brings considerable challenges due to its (1) terrain diversity; (2) complex and irregular parcel objects; and (3) multiple domain styles. Our proposed GTPBD dataset is suitable for four different tasks: semantic segmentation, edge detection, terraced parcel extraction, and unsupervised domain adaptation (UDA).


WolBanking77: Wolof Banking Speech Intent Classification Dataset

Neural Information Processing Systems

Intent classification models have made significant progress in recent years. However, previous studies primarily focus on high-resource language datasets, which results in a gap for low-resource languages and for regions with high rates of illiteracy, where languages are more spoken than read or written. This is the case in Senegal, for example, where Wolof is spoken by around 90% of the population, while the national illiteracy rate remains at 42%. Wolof is spoken by more than 10 million people in the West African region. To address these limitations, we introduce the Wolof Banking Speech Intent Classification Dataset (WolBanking77) for academic research in intent classification.


Scientists propose radical new theory of consciousness - and claim it doesn't depend on flesh and blood

Daily Mail - Science & tech



SGN: Shifted Window-Based Hierarchical Variable Grouping for Multivariate Time Series Classification

Neural Information Processing Systems

Multivariate time series (MTS) classification has attracted increasing attention across various domains. Existing methods either decompose MTS into separate univariate series, ignoring inter-variable dependencies, or jointly model all variables, which may lead to over-smoothing and loss of semantic structure. These limitations become particularly pronounced when dealing with complex and heterogeneous variable types. To address these challenges, we propose SwinGroupNet (SGN), which explores a novel perspective for constructing variable interaction and temporal dependency. Specifically, SGN processes multi-scale time series using (1) Variable Group Embedding (VGE), which partitions variables into groups and performs independent group-wise embedding; (2) Multi-Scale Group Window Mixing (MGWM), which reconstructs variable interactions by modeling both intra-group and inter-group dependencies while extracting multi-scale temporal features; and (3) Periodic Window Shifting and Merging (PWSM), which exploits inherent periodic patterns to enable hierarchical temporal interaction and feature aggregation. Extensive experiments on diverse benchmark datasets from multiple domains demonstrate that SGN consistently achieves state-of-the-art performance, with an average improvement of 4.2% over existing methods. We release the source code at https://github.com/colison/SGN.