Tokyo government builds infrastructure to expand use of generative AI

The Japan Times

The Tokyo Metropolitan Government is developing a Generative AI Platform, which will allow government employees to create AI applications to assist with their work. The Tokyo Metropolitan Government and municipal governments throughout the Japanese capital are increasingly using generative artificial intelligence in their administrative operations. To support this trend, the metropolitan government is working with GovTech Tokyo, an affiliated organization that promotes digitalization in local governments, to develop a Generative AI Platform. The system will allow government employees to create generative AI applications tailored to their specific duties. By encouraging active use of the platform, Tokyo authorities aim to boost efficiency in public services and address growing concerns over labor shortages.


Phantom flight: Iran war creates 9,100-km round trips to nowhere

The Japan Times

Since the conflict in the Middle East began on Feb. 28, Emirates has cancelled more than 2,000 flights -- 54% of scheduled services, according to data from Cirium. As Emirates flight EK10 from London cruised over Saudi Arabia on Monday, news broke of a drone strike at its destination, Dubai. The aircraft turned back to Gatwick, flight data shows, completing a 9,100 km round trip -- one of dozens of flights to nowhere triggered by the Middle East war. Roughly 30 Emirates flights heading to Dubai International Airport were also ordered back or rerouted after Iranian drone attacks temporarily shut what is normally the world's busiest airport for international passengers. Passengers expecting a dawn landing in the glitzy United Arab Emirates port city were stunned.


NTT Global Data Centers plans to double capacity in AI boom

The Japan Times

NTT Global Data Centers is working on 34 projects to double its capacity to 4 gigawatts within as little as two years, CEO Doug Adams said, as it races to meet surging global demand driven by the AI boom. NTT Global Data Centers, the world's third-largest data center provider outside of China, is working to double its capacity to 4 gigawatts to meet the rising global demand for the critical digital infrastructure amid an artificial intelligence boom. The unit of Japan's NTT is working on 34 projects that will double its capacity in as soon as two years, according to the data center business's Chief Executive Officer Doug Adams. Capacity will continue to increase from there, and will be "well over 5 gigawatts" in five years, Adams said in an interview. NTT GDC has seen increasing demand from companies moving more of their software and operations to the cloud as well as businesses hunting for extra capacity to run AI programs. The business's revenue is expected to keep growing at more than 20% a year, Adams said, declining to give a specific time period.


Ryan Gosling on bringing humour to sci-fi adventure Project Hail Mary

BBC News

Humour and science fiction may not seem obvious bedfellows but a history of cinema will tell you different. Think Spaceballs, Mars Attacks! and Everything Everywhere All at Once to name but a few. And now Ryan Gosling is hopping on board. The 45-year-old is both the lead actor and producer of Project Hail Mary, a space adventure film based on the 2021 Andy Weir novel of the same name. While Gosling has showcased his comedy chops in films such as Barbie and The Nice Guys, he tells the BBC he's always struggled as an actor because "I would want to bring humour to something" but has found opportunities to be funny limited with some projects.


Kriging via variably scaled kernels

arXiv.org Machine Learning

Classical Gaussian processes and Kriging models are commonly based on stationary kernels, whereby correlations between observations depend exclusively on the relative distance between scattered data. While this assumption ensures analytical tractability, it limits the ability of Gaussian processes to represent heterogeneous correlation structures. In this work, we investigate variably scaled kernels as an effective tool for constructing non-stationary Gaussian processes by explicitly modifying the correlation structure of the data. Through a scaling function, variably scaled kernels alter the correlations between data and enable the modeling of targets exhibiting abrupt changes or discontinuities. We analyse the resulting predictive uncertainty via the variably scaled kernel power function and clarify the relationship between variably scaled kernel-based constructions and classical non-stationary kernels. Numerical experiments demonstrate that variably scaled kernel-based Gaussian processes yield improved reconstruction accuracy and provide uncertainty estimates that reflect the underlying structure of the data.
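As a rough illustration of the idea in this abstract, the sketch below builds a variably scaled kernel in its usual form, K_psi(x, y) = K((x, psi(x)), (y, psi(y))), by appending the value of a scaling function to each input before applying a stationary Gaussian kernel. The step-shaped `step_psi`, the jump location, and the jump height are assumptions chosen for illustration, not the paper's settings.

```python
import numpy as np

def gaussian_kernel(a, b, length_scale=1.0):
    """Stationary Gaussian kernel matrix between the rows of a and b."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * length_scale ** 2))

def step_psi(x, jump_at=0.5, height=3.0):
    """Hypothetical scaling function: a step located at an assumed discontinuity."""
    return np.where(x[:, 0] >= jump_at, height, 0.0)

def vsk_kernel(x, y, psi, length_scale=1.0):
    """Variably scaled kernel: evaluate the base kernel on inputs augmented with psi."""
    x_aug = np.column_stack([x, psi(x)])
    y_aug = np.column_stack([y, psi(y)])
    return gaussian_kernel(x_aug, y_aug, length_scale)

# Two points left of the jump and one point just right of it.
x = np.array([[0.45], [0.49], [0.51]])
k_stationary = gaussian_kernel(x, x)
k_vsk = vsk_kernel(x, x, step_psi)
```

Points on the same side of the jump keep their stationary correlation, while the correlation between 0.49 and 0.51 drops sharply under the scaled kernel, which is how such a construction can represent an abrupt change.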


AR-Flow VAE: A Structured Autoregressive Flow Prior Variational Autoencoder for Unsupervised Blind Source Separation

arXiv.org Machine Learning

Blind source separation (BSS) seeks to recover latent source signals from observed mixtures. Variational autoencoders (VAEs) offer a natural perspective for this problem: the latent variables can be interpreted as source components, the encoder can be viewed as a demixing mapping from observations to sources, and the decoder can be regarded as a remixing process from inferred sources back to observations. In this work, we propose AR-Flow VAE, a novel VAE-based framework for BSS in which each latent source is endowed with a parameter-adaptive autoregressive flow prior. This prior significantly enhances the flexibility of latent source modeling, enabling the framework to capture complex non-Gaussian behaviors and structured dependencies, such as temporal correlations, that are difficult to represent with conventional priors. In addition, the structured prior design assigns distinct priors to different latent dimensions, thereby encouraging the latent components to separate into different source signals under heterogeneous prior constraints. Experimental results validate the effectiveness of the proposed architecture for blind source separation. More importantly, this work provides a foundation for future investigations into the identifiability and interpretability of AR-Flow VAE.


Gaussian Process Limit Reveals Structural Benefits of Graph Transformers

arXiv.org Machine Learning

Graph transformers are the state of the art for learning from graph-structured data and are empirically known to avoid several pitfalls of message-passing architectures. However, there is limited theoretical analysis of why these models perform well in practice. In this work, we prove that attention-based architectures have structural benefits over graph convolutional networks in the context of node-level prediction tasks. Specifically, we study the neural network Gaussian process limits of graph transformers (GAT, Graphormer, Specformer) with infinite width and infinite heads, and derive the node-level and edge-level kernels across the layers. Our results characterise how the node features and the graph structure propagate through the graph attention layers. As a specific example, we prove that graph transformers structurally preserve community information and maintain discriminative node representations even in deep layers, thereby preventing oversmoothing. We provide empirical evidence on synthetic and real-world graphs that validates our theoretical insights, for example that integrating informative priors and positional encoding can improve the performance of deep graph transformers.


Pretrained Multilingual Transformers Reveal Quantitative Distance Between Human Languages

arXiv.org Machine Learning

Understanding the distance between human languages is central to linguistics, anthropology, and tracing human evolutionary history. Yet, while linguistics has long provided rich qualitative accounts of cross-linguistic variation, a unified and scalable quantitative approach to measuring language distance remains lacking. In this paper, we introduce a method that leverages pretrained multilingual language models as systematic instruments for linguistic measurement. Specifically, we show that the spontaneously emerged attention mechanisms of these models provide a robust, tokenization-agnostic measure of cross-linguistic distance, termed Attention Transport Distance (ATD). By treating attention matrices as probability distributions and measuring their geometric divergence via optimal transport, we quantify the representational distance between languages during translation. Applying ATD to a large and diverse set of languages, we demonstrate that the resulting distances recover established linguistic groupings with high fidelity and reveal patterns aligned with geographic and contact-induced relationships. Furthermore, incorporating ATD as a regularizer improves transfer performance in low-resource machine translation. Our results establish a principled foundation for testing linguistic hypotheses using artificial neural networks. This framework transforms multilingual models into powerful tools for quantitative linguistic discovery, facilitating more equitable multilingual AI.
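To make the phrase "treating attention matrices as probability distributions and measuring their geometric divergence via optimal transport" concrete, here is a minimal sketch of the one-dimensional optimal transport (Wasserstein-1) cost between two attention rows defined over the same token positions. This is not the paper's ATD definition, which the abstract does not spell out; the function name and the unit spacing between positions are assumptions.

```python
import numpy as np

def wasserstein_1d(p, q):
    """Wasserstein-1 distance between two distributions on positions 0..n-1.

    With unit spacing between positions, the optimal transport cost equals
    the sum of absolute differences between the two cumulative distributions.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()  # normalise attention weights into probabilities
    q = q / q.sum()
    return float(np.abs(np.cumsum(p - q)).sum())

# Attention concentrated on the first token versus the last token.
far_apart = wasserstein_1d([1.0, 0.0, 0.0], [0.0, 0.0, 1.0])
# Attention shifted by one position.
shifted = wasserstein_1d([0.5, 0.5, 0.0], [0.0, 0.5, 0.5])
```

Unlike a bin-by-bin divergence, this cost grows with how far the attention mass has to move, which is the geometric aspect the abstract refers to.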


Self-Regularized Learning Methods

arXiv.org Machine Learning

We introduce a general framework for analyzing learning algorithms based on the notion of self-regularization, which captures implicit complexity control without requiring explicit regularization. This is motivated by previous observations that many algorithms, such as gradient-descent-based learning, exhibit implicit regularization. In a nutshell, for a self-regularized algorithm the complexity of the predictor is inherently controlled by that of the simplest comparator achieving the same empirical risk. This framework is sufficiently rich to cover both classical regularized empirical risk minimization and gradient descent. Building on self-regularization, we provide a thorough statistical analysis of such algorithms including minimax-optimal rates, where it suffices to show that the algorithm is self-regularized -- all further requirements stem from the learning problem itself. Finally, we discuss the problem of data-dependent hyperparameter selection, providing a general result which yields minimax-optimal rates up to a double logarithmic factor and covers data-driven early stopping for RKHS-based gradient descent.
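The implicit regularization of gradient descent mentioned in this abstract can be seen in a standard textbook setting, sketched below: on an underdetermined least-squares problem, gradient descent started from zero converges to the minimum-norm interpolating solution even though no penalty term is used. The problem sizes, random seed, and iteration count are assumptions for illustration, and this is not the paper's framework.

```python
import numpy as np

def gradient_descent_least_squares(X, y, n_steps=5000):
    """Plain gradient descent on 0.5 * ||X w - y||^2, started from w = 0."""
    w = np.zeros(X.shape[1])
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / (largest singular value)^2
    for _ in range(n_steps):
        w -= step * X.T @ (X @ w - y)
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 20))  # 5 observations, 20 parameters
y = rng.normal(size=5)

w_gd = gradient_descent_least_squares(X, y)
w_min_norm = np.linalg.pinv(X) @ y  # minimum-norm interpolant
```

Because every update lies in the row space of `X`, the iterates never pick up a component outside it, so among all interpolating solutions gradient descent selects the one of smallest Euclidean norm.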


Murmurations, Mestre--Nagao sums, and Convolutional Neural Networks for elliptic curves

arXiv.org Machine Learning

We apply one-dimensional convolutional neural networks to the Frobenius traces of elliptic curves over $\mathbb{Q}$ and evaluate and interpret their predictive capacity. In keeping with similar experiments by Kazalicki--Vlah, Bujanović--Kazalicki--Novak, and Pozdnyakov, we observe high accuracy predictions for the analytic rank across a range of conductors. We interpret the prediction using saliency curves and explore the interesting interplay between murmurations and Mestre--Nagao sums, the details of which vary with the conductor and the (predicted) rank.
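For readers unfamiliar with the inputs described here, the sketch below computes Frobenius traces a_p = p + 1 - #E(F_p) by naive point counting for a curve y^2 = x^3 + ax + b at odd primes of good reduction, and then forms one variant of a Mestre--Nagao sum, (1 / log B) * sum over p <= B of a_p * log(p) / p. The choice of variant, the example curve y^2 = x^3 - x, and the short prime list are assumptions for illustration; the paper may use different normalisations.

```python
import math

def legendre(v, p):
    """Legendre symbol (v / p) for an odd prime p, via Euler's criterion."""
    v %= p
    if v == 0:
        return 0
    return 1 if pow(v, (p - 1) // 2, p) == 1 else -1

def frobenius_trace(a, b, p):
    """a_p = p + 1 - #E(F_p) for y^2 = x^3 + a*x + b, p an odd prime of good reduction.

    Each x contributes 1 + legendre(x^3 + a*x + b) affine points, so
    #E(F_p) = p + 1 + sum of Legendre symbols, and a_p is minus that sum.
    """
    return -sum(legendre(x ** 3 + a * x + b, p) for x in range(p))

def mestre_nagao_sum(a, b, primes):
    """One assumed variant: (1 / log B) * sum of a_p * log(p) / p over the given primes."""
    bound = max(primes)
    total = sum(frobenius_trace(a, b, p) * math.log(p) / p for p in primes)
    return total / math.log(bound)

# Example curve y^2 = x^3 - x, whose only bad prime is 2.
primes = [3, 5, 7, 11, 13]
traces = [frobenius_trace(-1, 0, p) for p in primes]
score = mestre_nagao_sum(-1, 0, primes)
```

The sequence `traces` is the kind of per-prime signal a one-dimensional convolutional network would consume; naive counting costs O(p) per prime, so it is only practical for small bounds.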