Goto

Collaborating Authors

 Oceania


Humpback whale songs have patterns that resemble human language

New Scientist

Humpback whale songs have statistical patterns in their structure that are remarkably similar to those seen in human language. While this doesn't mean the songs convey complex meanings like our sentences do, it hints that whales may learn their songs in a similar way to how human infants start to understand language. Only male humpback whales (Megaptera novaeangliae) sing, and the behaviour is thought to be important for attracting mates. The songs are constantly evolving, with new elements appearing and spreading through the population until the old song is completely replaced with a new one. We're finally realising that many species are "We think it's a little bit like a standardised test, where everybody's got to do the same task but you can make changes and embellishments to show that you're better at the task than everybody else," says Jenny Allen at Griffith University in Gold Coast, Australia.


Which countries have banned DeepSeek and why?

Al Jazeera

This week, government agencies in countries including South Korea and Australia have blocked access to Chinese artificial intelligence (AI) startup DeepSeek's new AI chatbot programme, mostly for government employees. Other countries, including the United States, have said they may also seek to block DeepSeek from government employees' mobile devices, according to media reports. All cite "security concerns" about the Chinese technology and a lack of clarity about how users' personal information is handled by the operator. Last month, DeepSeek made headlines after it caused share prices in US tech companies to plummet, after it claimed that its model would cost only a fraction of the money its competitors had spent on their own AI programmes to build. The news caused social media users to joke: "I can't believe ChatGPT lost its job to AI." Here's what we know about DeepSeek and why countries are banning it.


Decoder-Only LLMs are Better Controllers for Diffusion Models

arXiv.org Artificial Intelligence

Groundbreaking advancements in text-to-image generation have recently been achieved with the emergence of diffusion models. These models exhibit a remarkable ability to generate highly artistic and intricately detailed images based on textual prompts. However, obtaining desired generation outcomes often necessitates repetitive trials of manipulating text prompts just like casting spells on a magic mirror, and the reason behind that is the limited capability of semantic understanding inherent in current image generation models. Specifically, existing diffusion models encode the text prompt input with a pre-trained encoder structure, which is usually trained on a limited number of image-caption pairs. The state-of-the-art large language models (LLMs) based on the decoder-only structure have shown a powerful semantic understanding capability as their architectures are more suitable for training on very large-scale unlabeled data. In this work, we propose to enhance text-to-image diffusion models by borrowing the strength of semantic understanding from large language models, and devise a simple yet effective adapter to allow the diffusion models to be compatible with the decoder-only structure. Meanwhile, we also provide a supporting theoretical analysis with various architectures (e.g., encoder-only, encoder-decoder, and decoder-only), and conduct extensive empirical evaluations to verify its effectiveness. The experimental results show that the enhanced models with our adapter module are superior to the stat-of-the-art models in terms of text-to-image generation quality and reliability.


Synthetic Datasets for Machine Learning on Spatio-Temporal Graphs using PDEs

arXiv.org Artificial Intelligence

Many physical processes can be expressed through partial differential equations (PDEs). Real-world measurements of such processes are often collected at irregularly distributed points in space, which can be effectively represented as graphs; however, there are currently only a few existing datasets. Our work aims to make advancements in the field of PDE-modeling accessible to the temporal graph machine learning community, while addressing the data scarcity problem, by creating and utilizing datasets based on PDEs. In this work, we create and use synthetic datasets based on PDEs to support spatio-temporal graph modeling in machine learning for different applications. More precisely, we showcase three equations to model different types of disasters and hazards in the fields of epidemiology, atmospheric particles, and tsunami waves. Further, we show how such created datasets can be used by benchmarking several machine learning models on the epidemiological dataset. Additionally, we show how pre-training on this dataset can improve model performance on real-world epidemiological data. The presented methods enable others to create datasets and benchmarks customized to individual requirements.


Automating a Complete Software Test Process Using LLMs: An Automotive Case Study

arXiv.org Artificial Intelligence

Vehicle API testing verifies whether the interactions between a vehicle's internal systems and external applications meet expectations, ensuring that users can access and control various vehicle functions and data. However, this task is inherently complex, requiring the alignment and coordination of API systems, communication protocols, and even vehicle simulation systems to develop valid test cases. In practical industrial scenarios, inconsistencies, ambiguities, and interdependencies across various documents and system specifications pose significant challenges. This paper presents a system designed for the automated testing of in-vehicle APIs. By clearly defining and segmenting the testing process, we enable Large Language Models (LLMs) to focus on specific tasks, ensuring a stable and controlled testing workflow. Experiments conducted on over 100 APIs demonstrate that our system effectively automates vehicle API testing. The results also confirm that LLMs can efficiently handle mundane tasks requiring human judgment, making them suitable for complete automation in similar industrial contexts.


Scale-Insensitive Neural Network Significance Tests

arXiv.org Machine Learning

This paper develops a scale-insensitive framework for neural network significance testing, substantially generalizing existing approaches through three key innovations. First, we replace metric entropy calculations with Rademacher complexity bounds, enabling the analysis of neural networks without requiring bounded weights or specific architectural constraints. Second, we weaken the regularity conditions on the target function to require only Sobolev space membership $H^s([-1,1]^d)$ with $s > d/2$, significantly relaxing previous smoothness assumptions while maintaining optimal approximation rates. Third, we introduce a modified sieve space construction based on moment bounds rather than weight constraints, providing a more natural theoretical framework for modern deep learning practices. Our approach achieves these generalizations while preserving optimal convergence rates and establishing valid asymptotic distributions for test statistics. The technical foundation combines localization theory, sharp concentration inequalities, and scale-insensitive complexity measures to handle unbounded weights and general Lipschitz activation functions. This framework better aligns theoretical guarantees with contemporary deep learning practice while maintaining mathematical rigor.


An Empirical Study of Code Obfuscation Practices in the Google Play Store

arXiv.org Artificial Intelligence

The Android ecosystem is vulnerable to issues such as app repackaging, counterfeiting, and piracy, threatening both developers and users. To mitigate these risks, developers often employ code obfuscation techniques. However, while effective in protecting legitimate applications, obfuscation also hinders security investigations as it is often exploited for malicious purposes. As such, it is important to understand code obfuscation practices in Android apps. In this paper, we analyze over 500,000 Android APKs from Google Play, spanning an eight-year period, to investigate the evolution and prevalence of code obfuscation techniques. First, we propose a set of classifiers to detect obfuscated code, tools, and techniques and then conduct a longitudinal analysis to identify trends. Our results show a 13% increase in obfuscation from 2016 to 2023, with ProGuard and Allatori as the most commonly used tools. We also show that obfuscation is more prevalent in top-ranked apps and gaming genres such as Casino apps. To our knowledge, this is the first large-scale study of obfuscation adoption in the Google Play Store, providing insights for developers and security analysts.


CAMEF: Causal-Augmented Multi-Modality Event-Driven Financial Forecasting by Integrating Time Series Patterns and Salient Macroeconomic Announcements

arXiv.org Artificial Intelligence

Accurately forecasting the impact of macroeconomic events is critical for investors and policymakers. Salient events like monetary policy decisions and employment reports often trigger market movements by shaping expectations of economic growth and risk, thereby establishing causal relationships between events and market behavior. Existing forecasting methods typically focus either on textual analysis or time-series modeling, but fail to capture the multi-modal nature of financial markets and the causal relationship between events and price movements. To address these gaps, we propose CAMEF (Causal-Augmented Multi-Modality Event-Driven Financial Forecasting), a multi-modality framework that effectively integrates textual and time-series data with a causal learning mechanism and an LLM-based counterfactual event augmentation technique for causal-enhanced financial forecasting. Our contributions include: (1) a multi-modal framework that captures causal relationships between policy texts and historical price data; (2) a new financial dataset with six types of macroeconomic releases from 2008 to April 2024, and high-frequency real trading data for five key U.S. financial assets; and (3) an LLM-based counterfactual event augmentation strategy. We compare CAMEF to state-of-the-art transformer-based time-series and multi-modal baselines, and perform ablation studies to validate the effectiveness of the causal learning mechanism and event types.


Joint State and Noise Covariance Estimation

arXiv.org Artificial Intelligence

This paper tackles the problem of jointly estimating the noise covariance matrix alongside primary parameters (such as poses and points) from measurements corrupted by Gaussian noise. In such settings, the noise covariance matrix determines the weights assigned to individual measurements in the least squares problem. We show that the joint problem exhibits a convex structure and provide a full characterization of the optimal noise covariance estimate (with analytical solutions) within joint maximum a posteriori and likelihood frameworks and several variants. Leveraging this theoretical result, we propose two novel algorithms that jointly estimate the primary parameters and the noise covariance matrix. To validate our approach, we conduct extensive experiments across diverse scenarios and offer practical insights into their application in robotics and computer vision estimation problems with a particular focus on SLAM.


MedGNN: Towards Multi-resolution Spatiotemporal Graph Learning for Medical Time Series Classification

arXiv.org Artificial Intelligence

Medical time series has been playing a vital role in real-world healthcare systems as valuable information in monitoring health conditions of patients. Accurate classification for medical time series, e.g., Electrocardiography (ECG) signals, can help for early detection and diagnosis. Traditional methods towards medical time series classification rely on handcrafted feature extraction and statistical methods; with the recent advancement of artificial intelligence, the machine learning and deep learning methods have become more popular. However, existing methods often fail to fully model the complex spatial dynamics under different scales, which ignore the dynamic multi-resolution spatial and temporal joint inter-dependencies. Moreover, they are less likely to consider the special baseline wander problem as well as the multi-view characteristics of medical time series, which largely hinders their prediction performance. To address these limitations, we propose a Multi-resolution Spatiotemporal Graph Learning framework, MedGNN, for medical time series classification. Specifically, we first propose to construct multi-resolution adaptive graph structures to learn dynamic multi-scale embeddings. Then, to address the baseline wander problem, we propose Difference Attention Networks to operate self-attention mechanisms on the finite difference for temporal modeling. Moreover, to learn the multi-view characteristics, we utilize the Frequency Convolution Networks to capture complementary information of medical time series from the frequency domain. In addition, we introduce the Multi-resolution Graph Transformer architecture to model the dynamic dependencies and fuse the information from different resolutions. Finally, we have conducted extensive experiments on multiple medical real-world datasets that demonstrate the superior performance of our method. Our Code is available.