Asia
Trio charged over alleged plot to smuggle Nvidia chips from US to China
A trio linked with a US technology supplier have been charged over a plot to smuggle American artificial intelligence (AI) chips to China, the Department of Justice said on Thursday. The individuals allegedly conspired to sell billions of dollars' worth of technology to buyers in China by faking documents and using dummy equipment to slip past audits, according to the DOJ. The goods in question included Nvidia-made semiconductors, highly coveted AI chips which are subject to export controls. In August 2025, two Chinese nationals were also arrested and charged with illegally shipping millions of dollars' worth of Nvidia chips to China. The DOJ said in a statement on Thursday that it had arrested US citizen Yih-Shyan Wally Liaw and Taiwanese citizen Ting-Wei Willy Sun, while Ruei-Tsang Steven Chang, a Taiwanese citizen, remains a fugitive.
BTS Arirang review: K-pop idols rekindle their fire
The return of BTS is a big deal. In case you were in any doubt, just look at the frenzy surrounding the South Koreans' comeback. On Saturday, the band will kick off a sold-out, 82-date world tour with a free concert in Seoul, which is expected to be attended by more than 250,000 in-person fans and will be live-streamed on Netflix to more than 190 countries. When the tour wraps up in 2027, BTS are expected to have generated more than $1 billion in revenue. Some more outlandish estimates suggest they will eclipse the $2 billion haul of Taylor Swift's Eras tour.
As OpenClaw enthusiasm grips China, school kids and retirees alike raise 'lobsters'
Zhipu staff members help residents install and set up AutoClaw, a local version of the AI agent OpenClaw developed by Zhipu, at an office building in Beijing. BEIJING - Fan Xinquan, a retired electronics worker in Beijing, has recently started "raising a lobster", hoping that the AI agent he has been training can help organize his specialized industry knowledge better than chatbots like DeepSeek. "OpenClaw can actually help you accomplish many practical things," the 60-year-old said at a recent event hosted by AI startup Zhipu to teach people how to use and train the AI agent, which has gone viral in China, with its various local versions earning the "lobster" nickname. In the past month, OpenClaw, which can connect several hardware and software tools and learn from the data produced with much less human intervention than a chatbot, has captured the imaginations of many in China, from retirees looking for side income to AI firms hoping to generate new revenue streams.
The Truncation Blind Spot: How Decoding Strategies Systematically Exclude Human-Like Token Choices
Arias, Esteban Garces, Sapargali, Nurzhan, Heumann, Christian, Aßenmacher, Matthias
Standard decoding strategies for text generation, including top-k, nucleus sampling, and contrastive search, select tokens based on likelihood, restricting selection to high-probability regions. Human language production operates differently: tokens are chosen for communicative appropriateness rather than statistical frequency. This mismatch creates a truncation blind spot: contextually appropriate but statistically rare tokens remain accessible to humans yet unreachable by likelihood-based decoding. We hypothesize this contributes to the detectability of machine-generated text. Analyzing over 1.8 million texts across eight language models, five decoding strategies, and 53 hyperparameter configurations, we find that 8-18% of human-selected tokens fall outside typical truncation boundaries. Simple classifiers trained on predictability and lexical diversity achieve remarkable detection rates. Crucially, neither model scale nor architecture correlates strongly with detectability; truncation parameters account for most variance. Configurations achieving low detectability often produce incoherent text, indicating that evading detection and producing natural text are distinct objectives. These findings suggest detectability is enhanced by likelihood-based token selection, not merely a matter of model capability.
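To make the truncation boundary concrete, the following is a minimal sketch of how top-k and nucleus (top-p) truncation each define a set of reachable tokens, and how a token a human might choose can fall outside both. The toy distribution, the `human_token` index, and the function names are illustrative assumptions, not data or code from the paper.

```python
def top_k_set(probs, k):
    """Indices of the k most probable tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return set(order[:k])


def nucleus_set(probs, p):
    """Smallest set of top-ranked tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, total = set(), 0.0
    for i in order:
        kept.add(i)
        total += probs[i]
        if total >= p:
            break
    return kept


# Hypothetical next-token distribution over a 6-token vocabulary.
probs = [0.50, 0.25, 0.12, 0.08, 0.04, 0.01]

# Assume a human writer picked token 4: contextually fine, but statistically rare.
human_token = 4

reachable_top_k = top_k_set(probs, k=3)
reachable_nucleus = nucleus_set(probs, p=0.9)

print("top-k reachable:", sorted(reachable_top_k))
print("nucleus reachable:", sorted(reachable_nucleus))
print("human token reachable by top-k:", human_token in reachable_top_k)
print("human token reachable by nucleus:", human_token in reachable_nucleus)
```

In this toy case the human-chosen token has nonzero model probability but zero probability of being sampled under either truncation rule, which is the kind of exclusion the abstract describes.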
Learning-to-Defer with Expert-Conditioned Advice
Montreuil, Yannis, Montreuil, Leïna, Carlier, Axel, Ng, Lai Xing, Ooi, Wei Tsang
Learning-to-Defer routes each input to the expert that minimizes expected cost, but it assumes that the information available to every expert is fixed at decision time. Many modern systems violate this assumption: after selecting an expert, one may also choose what additional information that expert should receive, such as retrieved documents, tool outputs, or escalation context. We study this problem and call it Learning-to-Defer with advice. We show that a broad family of natural separated surrogates, which learn routing and advice with distinct heads, is inconsistent even in the smallest non-trivial setting. We then introduce an augmented surrogate that operates on the composite expert--advice action space and prove an $\mathcal{H}$-consistency guarantee together with an excess-risk transfer bound, yielding recovery of the Bayes-optimal policy in the limit. Experiments on tabular, language, and multi-modal tasks show that the resulting method improves over standard Learning-to-Defer while adapting its advice-acquisition behavior to the cost regime; a synthetic benchmark confirms the failure mode predicted for separated surrogates.
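The composite expert--advice action space can be illustrated with a small decision sketch. It assumes the expected cost of every (expert, advice) pair is already known for a single input; the expert names, advice options, and cost values below are hypothetical, and the sketch shows only the decision rule, not the surrogate loss or training procedure studied in the paper.

```python
from itertools import product

# Hypothetical experts and advice options.
EXPERTS = ["model", "human"]
ADVICE = ["none", "retrieved_docs"]

# Hypothetical expected cost (error plus consultation and advice cost)
# of each composite action for one input.
EXPECTED_COST = {
    ("model", "none"): 0.40,
    ("model", "retrieved_docs"): 0.15,
    ("human", "none"): 0.25,
    ("human", "retrieved_docs"): 0.30,
}


def route_fixed_advice(cost, experts, advice_option):
    """Standard Learning-to-Defer: advice is fixed, only the expert is chosen."""
    return min(experts, key=lambda e: cost[(e, advice_option)])


def route_with_advice(cost, experts, advice):
    """Learning-to-Defer with advice: choose the cheapest (expert, advice) pair."""
    return min(product(experts, advice), key=lambda pair: cost[pair])


fixed_choice = route_fixed_advice(EXPECTED_COST, EXPERTS, "none")
joint_choice = route_with_advice(EXPECTED_COST, EXPERTS, ADVICE)

print("fixed advice:", fixed_choice)
print("joint choice:", joint_choice)
```

With these made-up numbers, routing under fixed information defers to the human, while searching the composite space prefers the model supplied with retrieved documents, so the choice of advice changes which expert is best.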
SRRM: Improving Recursive Transport Surrogates in the Small-Discrepancy Regime
Zhang, Yufei, Wang, Tao, Zhang, Jingyi
Recursive partitioning methods provide computationally efficient surrogates for the Wasserstein distance, yet their statistical behavior and their resolution in the small-discrepancy regime remain insufficiently understood. We study Recursive Rank Matching (RRM) as a representative instance of this class under a population-anchored reference. In this setting, we establish consistency and an explicit convergence rate for the anchored empirical RRM under the quadratic cost. We then identify a dominant mismatch mechanism responsible for the loss of resolution in the small-discrepancy regime. Based on this analysis, we introduce Selective Recursive Rank Matching (SRRM), which suppresses the resulting dominant mismatches and yields a higher-fidelity practical surrogate for the Wasserstein distance at moderate additional computational cost.
BoundAD: Boundary-Aware Negative Generation for Time Series Anomaly Detection
Wang, Xiancheng, Wang, Lin, Zhang, Zhibo, Wang, Rui, Zhao, Minghang
Contrastive learning methods for time series anomaly detection (TSAD) heavily depend on the quality of negative sample construction. However, existing strategies based on random perturbations or pseudo-anomaly injection often struggle to simultaneously preserve temporal semantic consistency and provide effective decision-boundary supervision. Most existing methods rely on prior anomaly injection, while overlooking the potential of generating hard negatives near the data manifold boundary directly from normal samples themselves. To address this issue, we propose a reconstruction-driven boundary negative generation framework that automatically constructs hard negatives through the reconstruction process of normal samples. Specifically, the method first employs a reconstruction network to capture normal temporal patterns, and then introduces a reinforcement learning strategy to adaptively adjust the optimization update magnitude according to the current reconstruction state. In this way, boundary-shifted samples close to the normal data manifold can be induced along the reconstruction trajectory and further used for subsequent contrastive representation learning. Unlike existing methods that depend on explicit anomaly injection, the proposed framework does not require predefined anomaly patterns, but instead mines more challenging boundary negatives from the model's own learning dynamics. Experimental results show that the proposed method effectively improves anomaly representation learning and achieves competitive detection performance on the current dataset.
The Exponentially Weighted Signature
Bloch, Alexandre, Cohen, Samuel N., Lyons, Terry, Mouterde, Joël, Walker, Benjamin
The signature is a canonical representation of a multidimensional path over an interval. However, it treats all historical information uniformly, offering no intrinsic mechanism for contextualising the relevance of the past. To address this, we introduce the Exponentially Weighted Signature (EWS), generalising the Exponentially Fading Memory (EFM) signature from diagonal to general bounded linear operators. These operators enable cross-channel coupling at the level of temporal weighting together with richer memory dynamics including oscillatory, growth, and regime-dependent behaviour, while preserving the algebraic strengths of the classical signature. We show that the EWS is the unique solution to a linear controlled differential equation on the tensor algebra, and that it generalises both state-space models and the Laplace and Fourier transforms of the path. The group-like structure of the EWS enables efficient computation and makes the framework amenable to gradient-based learning, with the full semigroup action parametrised by and learned through its generator. We use this framework to empirically demonstrate the expressivity gap between the EWS and both the signature and EFM on two SDE-based regression tasks.
Learning to Reason with Curriculum I: Provable Benefits of Autocurriculum
Rajaraman, Nived, Huang, Audrey, Dudik, Miro, Schapire, Robert, Foster, Dylan J., Krishnamurthy, Akshay
Chain-of-thought reasoning, where language models expend additional computation by producing thinking tokens prior to final responses, has driven significant advances in model capabilities. However, training these reasoning models is extremely costly in terms of both data and compute, as it involves collecting long traces of reasoning behavior from humans or synthetic generators and further post-training the model via reinforcement learning. Are these costs fundamental, or can they be reduced through better algorithmic design? We show that autocurriculum, where the model uses its own performance to decide which problems to focus training on, provably improves upon standard training recipes for both supervised fine-tuning (SFT) and reinforcement learning (RL). For SFT, we show that autocurriculum requires exponentially fewer reasoning demonstrations than non-adaptive fine-tuning, by focusing teacher supervision on prompts where the current model struggles. For RL fine-tuning, autocurriculum decouples the computational cost from the quality of the reference model, reducing the latter to a burn-in cost that is nearly independent of the target accuracy. These improvements arise purely from adaptive data selection, drawing on classical techniques from boosting and learning from counterexamples, and requiring no assumption on the distribution or difficulty of prompts.
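The idea of focusing teacher supervision on prompts where the current model struggles can be sketched as a simple selection step. This is only an illustration of adaptive data selection under stated assumptions: the toy addition task, the `toy_model` that fails whenever the sum has two digits, and the `budget` parameter are all hypothetical, and the paper's actual procedure (which draws on boosting and learning from counterexamples) is not reproduced here.

```python
def select_for_supervision(prompts, model_answer, is_correct, budget):
    """Return up to `budget` prompts that the current model answers incorrectly.

    Only these prompts would be sent to the teacher for a demonstration.
    """
    struggling = [p for p in prompts if not is_correct(p, model_answer(p))]
    return struggling[:budget]


# Hypothetical task: each prompt is a pair (a, b) and the target is a + b.
prompts = [(1, 2), (7, 8), (3, 4), (9, 9), (5, 5), (2, 6)]


def toy_model(prompt):
    """Stand-in for the current model: it drops the tens digit of the sum."""
    a, b = prompt
    return (a + b) % 10


def is_correct(prompt, answer):
    a, b = prompt
    return answer == a + b


# A non-adaptive recipe would request demonstrations for all six prompts;
# the adaptive one requests them only where the model currently fails.
to_label = select_for_supervision(prompts, toy_model, is_correct, budget=2)
print("prompts sent to teacher:", to_label)
```

Here the adaptive selection spends its two teacher queries on sums that require a carry, which are the only cases the toy model gets wrong.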