How the Trump-Xi summit could set superpower relations for many years to come

BBC News

Security around Beijing's historic Tiananmen Square has been heightened for days, with rumours on social media swirling of a special parade or some big, choreographed event. Preparations for this major event have started with a whisper, but China appears ready to put on a show for US President Donald Trump. The visit will include talks, a banquet, and a visit to the Temple of Heaven, a complex of imperial temples where emperors would pray for a good harvest. And both Trump and Chinese President Xi Jinping will be hoping the visit will bear fruit. This summit between the world's two most powerful leaders is set to be one of the most consequential encounters for years.


A Visualization for Comparative Analysis of Regression Models

Mountasir, Nassime, Lafabregue, Baptiste, Albert, Bruno, Lachiche, Nicolas

arXiv.org Machine Learning

As regression is a widely studied problem, many methods have been proposed to solve it, each of them often requiring setting different hyper-parameters. Therefore, selecting the proper method for a given application may be very difficult and relies on comparing their performances. Performance is usually measured using various metrics such as Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), or R-squared ($R^2$). These metrics provide a numerical summary of predictive accuracy by quantifying the difference between predicted and actual values. However, while these metrics are widely used in the literature for summarizing model performance and useful to distinguish between models performing poorly and well, they often aggregate too much information. This article addresses these limitations by introducing a novel visualization approach that highlights key aspects of regression model performance. The proposed method builds upon three main contributions: (1) considering the residuals in a 2D space, which allows for simultaneous evaluation of errors from two models, (2) leveraging the Mahalanobis distance to account for correlations and differences in scale within the data, and (3) employing a colormap to visualize the percentile-based distribution of errors, making it easier to identify dense regions and outliers. By graphically representing the distribution of errors and their correlations, this approach provides a more detailed and comprehensive view of model performance, enabling users to uncover patterns that traditional aggregate metrics may obscure. The proposed visualization method facilitates a deeper understanding of regression model performance differences and error distributions, enhancing the evaluation and comparison process.
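The three ingredients named in the abstract (paired residuals in 2D, the Mahalanobis distance, and percentile ranks that a colormap could encode) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, and the choice to measure distance from the mean residual pair using the sample covariance are all assumptions made for illustration.

```python
import math

def mae(residuals):
    """Mean Absolute Error of a list of residuals."""
    return sum(abs(r) for r in residuals) / len(residuals)

def mahalanobis_2d(res_a, res_b):
    """Mahalanobis distance of each (res_a[i], res_b[i]) pair from the mean pair.

    Assumption: the 2x2 sample covariance of the residual cloud is used.
    """
    n = len(res_a)
    mean_a = sum(res_a) / n
    mean_b = sum(res_b) / n
    s_aa = sum((a - mean_a) ** 2 for a in res_a) / (n - 1)
    s_bb = sum((b - mean_b) ** 2 for b in res_b) / (n - 1)
    s_ab = sum((a - mean_a) * (b - mean_b) for a, b in zip(res_a, res_b)) / (n - 1)
    det = s_aa * s_bb - s_ab ** 2
    if det <= 0:
        raise ValueError("singular covariance: residuals are degenerate")
    distances = []
    for a, b in zip(res_a, res_b):
        da, db = a - mean_a, b - mean_b
        # d^2 = [da db] S^{-1} [da db]^T, with the 2x2 inverse written out.
        d2 = (s_bb * da * da - 2 * s_ab * da * db + s_aa * db * db) / det
        distances.append(math.sqrt(max(d2, 0.0)))
    return distances

def percentile_ranks(values):
    """Fraction of values less than or equal to each value, in (0, 1]."""
    n = len(values)
    return [sum(1 for w in values if w <= v) / n for v in values]

# Toy data (hypothetical): one target vector and two models' predictions.
y_true = [3.0, 5.0, 7.0, 9.0, 11.0]
pred_a = [2.5, 5.5, 6.0, 9.5, 14.0]
pred_b = [3.5, 4.0, 7.5, 8.0, 11.5]

res_a = [p - t for p, t in zip(pred_a, y_true)]
res_b = [p - t for p, t in zip(pred_b, y_true)]

dist = mahalanobis_2d(res_a, res_b)
ranks = percentile_ranks(dist)  # values a colormap could encode per point

for a, b, d, r in zip(res_a, res_b, dist, ranks):
    print(f"res_a={a:+.2f} res_b={b:+.2f} mahalanobis={d:.3f} percentile={r:.2f}")
```

Plotting `res_a` against `res_b` and coloring each point by its entry in `ranks` would give a scatter in the spirit of the description, whereas a single aggregate such as `mae(res_a)` collapses the same information into one number.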


India's outsourcing industry is worth $300bn. Can it survive AI?

BBC News

Indian technology stocks have seen an unprecedented rout over the past few weeks amid fears that artificial intelligence will upend the traditional outsourcing model that powers the country's $300bn (£223bn) back-office industry. The sell-off - part of a global correction in traditional software and IT stocks - preceded the market nervousness caused by recent geopolitical uncertainty, and is particularly significant for India. Over the past three-and-a-half decades, India's software industry has created millions of white-collar jobs, spawning a new middle class driven by high ambition and strong purchasing power. This, in turn, has fuelled demand for apartments, cars and restaurants across top-tier cities such as Bengaluru, Hyderabad and Gurugram over the past 30 years.


AI-driven pirated manga is booming. Can AI also help curb it?

The Japan Times

Japan's content industry -- which includes anime, manga and video games -- is a major export for the country. Such exports were valued at ¥6 trillion ($38 billion) in 2024. When it comes to pirated manga online, which is being produced quicker thanks to artificial intelligence, government officials in Japan are planning to fight fire with fire and use AI to crack down on it.


Bin2Vec: Interpretable and Auditable Multi-View Binary Analysis for Code Plagiarism Detection

Moussaoui, Moussa, Houichime, Tarik, Sadiq, Abdelalim

arXiv.org Artificial Intelligence

We introduce Bin2Vec, a new framework that helps compare software programs in a clear and explainable way. Instead of focusing only on one type of information, Bin2Vec combines what a program looks like (its built-in functions, imports, and exports) with how it behaves when it runs (its instructions and memory usage). This gives a more complete picture when deciding whether two programs are similar or not. Bin2Vec represents these different types of information as views that can be inspected separately using easy-to-read charts, and then brings them together into an overall similarity score. Bin2Vec acts as a bridge between binary representations and machine learning techniques by generating feature representations that can be efficiently processed by machine-learning models. We tested Bin2Vec on multiple versions of two well-known Windows programs, PuTTY and 7-Zip. The primary results strongly confirmed that our method computes an optimal and visualization-friendly representation of the analyzed software. For example, PuTTY versions showed more complex behavior and memory activity, while 7-Zip versions focused more on performance-related patterns. Overall, Bin2Vec provides decisions that are both reliable and explainable to humans. Because it is modular and easy to extend, it can be applied to tasks like auditing, verifying software origins, or quickly screening large numbers of programs in cybersecurity and reverse-engineering work.
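The idea of scoring each view separately and then merging the scores can be sketched as follows. The abstract does not say how Bin2Vec computes or combines its views, so everything here is an assumption for illustration: the view names, the toy feature vectors, the use of cosine similarity per view, and the weighted average as the combination rule.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def multi_view_similarity(prog_a, prog_b, weights=None):
    """Return (per-view similarities, weighted overall score) for shared views.

    Hypothetical combination rule: a weighted mean of per-view cosine scores.
    """
    views = sorted(set(prog_a) & set(prog_b))
    if not views:
        raise ValueError("the two programs share no views")
    if weights is None:
        weights = {view: 1.0 for view in views}
    per_view = {view: cosine(prog_a[view], prog_b[view]) for view in views}
    total_weight = sum(weights[view] for view in views)
    overall = sum(weights[view] * per_view[view] for view in views) / total_weight
    return per_view, overall

# Toy feature vectors (hypothetical): one static view and one dynamic view.
version_1 = {"imports": [1, 0, 1, 1], "instructions": [3, 1, 0, 2]}
version_2 = {"imports": [1, 0, 1, 1], "instructions": [0, 1, 3, 0]}

per_view, overall = multi_view_similarity(version_1, version_2)
for view, score in per_view.items():
    print(f"{view}: {score:.3f}")
print(f"overall: {overall:.3f}")
```

Keeping the per-view scores alongside the overall score is what makes a result auditable in this sketch: here the two versions agree on the static view and diverge on the dynamic one, which a single number would hide.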


HCT-QA: A Benchmark for Question Answering on Human-Centric Tables

Ahmad, Mohammad S., Naeem, Zan A., Aupetit, Michaël, Elmagarmid, Ahmed, Eltabakh, Mohamed, Ma, Xiasong, Ouzzani, Mourad, Ruan, Chaoyi

arXiv.org Artificial Intelligence

Tabular data embedded within PDF files, web pages, and other document formats are prevalent across numerous sectors such as government, engineering, science, and business. These human-centric tables (HCTs) possess a unique combination of high business value, intricate layouts, limited operational power at scale, and sometimes serve as the only data source for critical insights. However, their complexity poses significant challenges to traditional data extraction, processing, and querying methods. While current solutions focus on transforming these tables into relational formats for SQL queries, they fall short in handling the diverse and complex layouts of HCTs and hence in making them amenable to querying. This paper describes HCT-QA, an extensive benchmark of HCTs, natural language queries, and related answers on thousands of tables. Our dataset includes 2,188 real-world HCTs with 9,835 QA pairs and 4,679 synthetic tables with 67.5K QA pairs. While HCTs can potentially be processed by different types of query engines, in this paper we focus on Large Language Models as potential engines and assess their ability to process and query such tables.


GAZE: Governance-Aware pre-annotation for Zero-shot World Model Environments

Krishna, Leela, Zhao, Mengyang, Pasula, Saicharithreddy, Rajgarhia, Harshit, Mukherji, Abhishek

arXiv.org Artificial Intelligence

Training robust world models requires large-scale, precisely labeled multimodal datasets, a process historically bottlenecked by slow and expensive manual annotation. We present a production-tested GAZE pipeline that automates the conversion of raw, long-form video into rich, task-ready supervision for world-model training. Our system (i) normalizes proprietary 360-degree formats into standard views and shards them for parallel processing; (ii) applies a suite of AI models (scene understanding, object tracking, audio transcription, PII/NSFW/minor detection) for dense, multimodal pre-annotation; and (iii) consolidates signals into a structured output specification for rapid human validation. The GAZE workflow demonstrably yields efficiency gains (~19 minutes saved per review hour) and reduces human review volume by >80% through conservative auto-skipping of low-salience segments. By increasing label density and consistency while integrating privacy safeguards and chain-of-custody metadata, our method generates high-fidelity, privacy-aware datasets directly consumable for learning cross-modal dynamics and action-conditioned prediction. We detail our orchestration, model choices, and data dictionary to provide a scalable blueprint for generating high-quality world model training data without sacrificing throughput or governance.


British parts found in Russian drones, Zelensky says

BBC News

British microcomputers were among more than 100,000 foreign-made parts contained in Russian missiles and drones used in Sunday's deadly strikes on Ukraine, Volodymyr Zelensky has said. The Ukrainian president called for further effective sanctions after saying parts originating in allied countries including Germany, Japan and the US have been identified in Russian weapons. The Department for Business and Trade (DBT) said it had recently undertaken efforts to crack down on UK firms whose products have continued to make their way into Russia's military supply chain. "We take reports of goods from UK companies being found in Russian weaponry incredibly seriously," a government spokesperson said. The spokesperson said the government had banned the export of thousands of goods to Russia "including every battlefield item Ukraine has brought to our attention", adding that they have imposed the most severe package of sanctions.


ChannelFlow-Tools: A Standardized Dataset Creation Pipeline for 3D Obstructed Channel Flows

Kavane, Shubham, Kulkarni, Kajol, Koestler, Harald

arXiv.org Artificial Intelligence

We present ChannelFlow-Tools, a configuration-driven framework that standardizes the end-to-end path from programmatic CAD solid generation to ML-ready inputs and targets for 3D obstructed channel flows. The toolchain integrates geometry synthesis with feasibility checks, signed distance field (SDF) voxelization, automated solver orchestration on HPC (waLBerla LBM), and Cartesian resampling to co-registered multi-resolution tensors. A single Hydra/OmegaConf configuration governs all stages, enabling deterministic reproduction and controlled ablations. As a case study, we generate 10k+ scenes spanning Re=100-15000 with diverse shapes and poses. An end-to-end evaluation of storage trade-offs directly from the emitted artifacts, a minimal 3D U-Net at 128x32x32, and example surrogate models with dataset size illustrate that the standardized representations support reproducible ML training. ChannelFlow-Tools turns one-off dataset creation into a reproducible, configurable pipeline for CFD surrogate modeling.
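As a rough illustration of the signed distance field (SDF) voxelization stage, the sketch below fills a small Cartesian grid with the signed distance to a single spherical obstacle. It is not taken from ChannelFlow-Tools: the function name, the grid size, the sphere, and the sign convention (negative inside the solid) are all assumptions, and the actual toolchain works from CAD solids rather than an analytic shape.

```python
import math

def sphere_sdf_grid(shape, center, radius, spacing=1.0):
    """Signed distance to a sphere, sampled at voxel centres of an nx*ny*nz grid.

    Assumed convention: negative inside the obstacle, positive in the fluid.
    """
    nx, ny, nz = shape
    cx, cy, cz = center
    grid = []
    for i in range(nx):
        plane = []
        for j in range(ny):
            row = []
            for k in range(nz):
                # Physical coordinates of the voxel centre.
                x = (i + 0.5) * spacing
                y = (j + 0.5) * spacing
                z = (k + 0.5) * spacing
                dist = math.sqrt((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2)
                row.append(dist - radius)
            plane.append(row)
        grid.append(plane)
    return grid

# Hypothetical channel of 16 x 8 x 8 voxels with one sphere in the middle.
sdf = sphere_sdf_grid(shape=(16, 8, 8), center=(8.0, 4.0, 4.0), radius=2.0)

solid_voxels = sum(1 for plane in sdf for row in plane for value in row if value < 0)
print("voxels inside the obstacle:", solid_voxels)
print("SDF near the sphere centre:", round(sdf[7][3][3], 3))
print("SDF at the channel corner:", round(sdf[0][0][0], 3))
```

A field like this gives a learning model a smooth geometry input instead of a binary mask, which is one common reason to prefer SDFs as ML-ready inputs.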


L.A. County residents illegally exported 'sensitive' high-power AI microchips to China, feds allege

Los Angeles Times

Two Los Angeles County residents face federal charges after they were arrested on suspicion of illegally exporting tens of millions of dollars' worth of artificial intelligence microchips to China, authorities said. Chuan Geng, 28, of Pasadena, and Shiwei Yang, 28, of El Monte, were taken into custody on Saturday for their alleged involvement in the illegal overseas export of processing units used in modern computing and artificial intelligence applications, according to a statement from the U.S. attorney's office for the Eastern District of California. Federal prosecutors said both were Chinese nationals, though Geng is a lawful permanent resident of the U.S. Yang, however, was in the country illegally as she had overstayed her visa, according to authorities. Yaoning 'Mike' Sun of Chino Hills is charged with acting as an illegal agent of a foreign power and conspiring to advance China-friendly policies in local government. In a criminal complaint, U.S. Justice Department officials alleged the pair had "knowingly and willingly" undercut federal export regulations to conceal illegal shipments to China for nearly three years.