
 Australia


Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM

Neural Information Processing Systems

[Figure: example queries for audio-visual-speech moment retrieval; text only partially recoverable: "A man is walking in a ...", "A man wearing a white mask is speaking ...", "A man recommends visiting local areas in Tokyo ..."]


Thirsty and power hungry: Australia is in the middle of a datacentre boom – but not everyone is convinced

The Guardian

There are about 160 datacentres operating in Australia, with another 90 proposed. They're a key part of the digital and AI economy, but they come at a high environmental cost and offer few operational jobs. Sun 21 Jun 2026 11.00 EDT. Last modified on Sun 21 Jun 2026 11.01 EDT. On Mamre Road, in Sydney's outer western suburbs, there are plans to build a "hyperscale" datacentre that will be one of the biggest in the world. If approved, the 52-hectare site will include six four-storey buildings that stretch 40 metres high, alongside 936 cooling units and 852 diesel backup power generators. The Mamre Road project is part of an estimated $155bn investment pipeline over the coming decade, amid a worldwide rush to build the infrastructure enabling the artificial intelligence revolution.


Boosting Skeleton-based Zero-Shot Action Recognition with Training-Free Test-Time Adaptation

Neural Information Processing Systems

We introduce Skeleton-Cache, the first training-free test-time adaptation framework for skeleton-based zero-shot action recognition (SZAR), aimed at improving model generalization to unseen actions during inference. Skeleton-Cache reformulates inference as a lightweight retrieval process over a non-parametric cache that stores structured skeleton representations, combining both global and fine-grained local descriptors. To guide the fusion of descriptor-wise predictions, we leverage the semantic reasoning capabilities of large language models (LLMs) to assign class-specific importance weights. By integrating these structured descriptors with LLM-guided semantic priors, Skeleton-Cache dynamically adapts to unseen actions without any additional training or access to training data. Extensive experiments on NTU RGB+D 60/120 and PKU-MMD II demonstrate that Skeleton-Cache consistently boosts the performance of various SZAR backbones under both zero-shot and generalized zero-shot settings.
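The retrieval-and-fusion idea in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the descriptor names, the cosine-similarity scoring, the max-over-entries rule, and the hand-written `class_weights` (which the abstract says come from an LLM) are hypothetical and are not taken from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def cache_predict(query, cache, class_weights):
    """Score each class by weighted descriptor-wise similarity to cache entries.

    query:         {descriptor_name: vector}
    cache:         list of (label, {descriptor_name: vector})
    class_weights: {label: {descriptor_name: weight}}; hypothetical stand-in
                   for the LLM-assigned class-specific importance weights.
    """
    scores = {}
    for label, entry in cache:
        fused = sum(
            class_weights[label][name] * cosine(query[name], entry[name])
            for name in query
        )
        # Keep the best-matching cache entry per class (an assumed rule).
        scores[label] = max(scores.get(label, float("-inf")), fused)
    best = max(scores, key=scores.get)
    return best, scores


# Toy cache with one global and two local descriptors per entry.
cache = [
    ("wave", {"global": [1, 0], "arm": [1, 0], "leg": [0, 1]}),
    ("kick", {"global": [1, 0], "arm": [0, 1], "leg": [1, 0]}),
]
class_weights = {
    "wave": {"global": 0.2, "arm": 0.7, "leg": 0.1},
    "kick": {"global": 0.2, "arm": 0.1, "leg": 0.7},
}
query = {"global": [1, 0], "arm": [1, 0], "leg": [0, 1]}
label, scores = cache_predict(query, cache, class_weights)
```

No parameters are trained in this sketch; prediction is a lookup over stored descriptors, which is the sense in which such a scheme is training-free.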


Australia and the U.S. May Be Allies, But Expect a World Cup Tussle

TIME - Tech



Addressing Mark Imbalance in Integration-free Neural Marked Temporal Point Processes

Neural Information Processing Systems

Marked Temporal Point Process (MTPP) has been well studied to model the event distribution in marked event streams, which can be used to predict the mark and arrival time of the next event. However, existing studies overlook that the distribution of event marks is highly imbalanced in many real-world applications, with some marks being frequent but others rare. The imbalance poses a significant challenge to the performance of the next event prediction, especially for events of rare marks. To address this issue, we propose a thresholding method, which learns thresholds to tune the mark probability normalized by the mark's prior probability to optimize mark prediction, rather than predicting the mark directly based on the mark probability as in existing studies. In conjunction with this method, we predict the mark first and then the time. In particular, we develop a novel neural MTPP model to support effective time sampling and estimation of mark probability without computationally expensive numerical improper integration. Extensive experiments on real-world datasets demonstrate the superior performance of our solution against various baselines for the next event mark and time prediction.
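A small sketch may help show why normalizing by the prior matters for rare marks. The abstract does not give the decision rule or how thresholds are learned, so the margin rule below (normalized probability minus a per-mark threshold) and all numeric values are assumptions for illustration only.

```python
def predict_mark(probs, priors, thresholds):
    """Pick the mark whose prior-normalized probability most exceeds its threshold.

    probs:      model probability for each mark at the next event
    priors:     empirical frequency of each mark in the training data
    thresholds: per-mark thresholds (learned in the paper; fixed by hand here)
    """
    ratios = [p / q for p, q in zip(probs, priors)]
    margins = [r - t for r, t in zip(ratios, thresholds)]
    return max(range(len(probs)), key=lambda k: margins[k])


probs = [0.7, 0.3]    # the model favors the frequent mark 0
priors = [0.9, 0.1]   # mark 1 is rare

plain_argmax = max(range(len(probs)), key=lambda k: probs[k])
normalized = predict_mark(probs, priors, thresholds=[1.0, 1.0])
stricter = predict_mark(probs, priors, thresholds=[1.0, 3.5])
```

Here mark 1 has probability 0.3, three times its prior of 0.1, so the normalized rule selects it even though plain argmax selects mark 0. Raising the threshold for mark 1 makes the rule more conservative about predicting the rare mark.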


Representation Consistency for Accurate and Coherent LLM Answer Aggregation

Neural Information Processing Systems

Test-time scaling improves large language models' (LLMs) performance by allocating more compute budget during inference. To achieve this, existing methods often require intricate modifications to prompting and sampling strategies. In this work, we introduce representation consistency (RC), a test-time scaling method for aggregating answers drawn from multiple candidate responses of an LLM regardless of how they were generated, including variations in prompt phrasing and sampling strategy. RC enhances answer aggregation by not only considering the number of occurrences of each answer in the candidate response set, but also the consistency of the model's internal activations while generating the set of responses leading to each answer. These activations can be either dense (raw model activations) or sparse (encoded via pretrained sparse autoencoders). Our rationale is that if the model's representations of multiple responses converging on the same answer are highly variable, this answer is more likely to be the result of incoherent reasoning and should be down-weighted during aggregation. Importantly, our method only uses cached activations and lightweight similarity computations and requires no additional model queries.
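The aggregation idea can be sketched as follows. The scoring formula here (answer frequency plus a weighted mean pairwise cosine similarity of activations) is an assumption chosen for illustration, as are the parameter name `lam` and the neutral consistency of 0.0 for answers that appear only once; the abstract does not specify the exact formula.

```python
import math
from collections import defaultdict
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def aggregate(responses, lam=0.5):
    """Choose an answer using both vote share and activation consistency.

    responses: list of (answer, cached_activation_vector)
    lam:       hypothetical weight on the consistency term; lam=0 reduces
               to plain majority voting.
    """
    groups = defaultdict(list)
    for answer, activation in responses:
        groups[answer].append(activation)

    total = len(responses)
    scores = {}
    for answer, acts in groups.items():
        pairs = list(combinations(acts, 2))
        # Mean pairwise similarity; singletons get a neutral 0.0 (assumption).
        consistency = (
            sum(cosine(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0
        )
        scores[answer] = len(acts) / total + lam * consistency
    return max(scores, key=scores.get), scores


responses = [
    ("A", [1.0, 0.0]), ("A", [0.0, 1.0]), ("A", [-1.0, 0.0]),  # scattered
    ("B", [1.0, 1.0]), ("B", [1.0, 0.9]),                      # tightly aligned
]
majority_answer, _ = aggregate(responses, lam=0.0)
rc_answer, rc_scores = aggregate(responses, lam=0.5)
```

In this toy input, answer "A" wins a plain majority vote, but its activations point in very different directions, so the consistency term down-weights it and "B" is selected. Only cached vectors and similarity computations are used, with no further model queries.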


Feeding Kids Eggs Early in Life Helps Prevent Food Allergy, New Study Says

TIME - Tech



ACT as Human: Multimodal Large Language Model Data Annotation with Critical Thinking

Neural Information Processing Systems

Supervised learning relies on high-quality labeled data, but obtaining such data through human annotation is both expensive and time-consuming. Recent work explores using large language models (LLMs) for annotation, but LLM-generated labels still fall short of human-level quality. To address this problem, we propose the Annotation with Critical Thinking (ACT) data pipeline, where LLMs serve not only as annotators but also as judges to critically identify potential errors. Human effort is then directed towards reviewing only the most "suspicious" cases, significantly improving the human annotation efficiency. Our major contributions are as follows: (1) ACT is applicable to a wide range of domains, including natural language processing (NLP), computer vision (CV), and multimodal understanding, by leveraging multimodal-LLMs (MLLMs).
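The triage step described above, where human effort goes only to the most suspicious cases, can be sketched as a simple ranking under a review budget. The field names (`id`, `label`, `suspicion`), the suspicion scores, and the fixed budget are hypothetical; the abstract does not state how the judge scores items or how the cutoff is chosen.

```python
def select_for_review(items, budget):
    """Split LLM-annotated items into human-review and auto-accept sets.

    items:  list of dicts with 'id', 'label' (LLM annotation) and
            'suspicion' (judge score in [0, 1]; higher means more likely wrong)
    budget: number of items humans can review
    """
    ranked = sorted(items, key=lambda it: it["suspicion"], reverse=True)
    to_review = [it["id"] for it in ranked[:budget]]
    auto_accept = [it["id"] for it in ranked[budget:]]
    return to_review, auto_accept


items = [
    {"id": "img-1", "label": "cat", "suspicion": 0.05},
    {"id": "img-2", "label": "dog", "suspicion": 0.80},
    {"id": "img-3", "label": "cat", "suspicion": 0.40},
    {"id": "img-4", "label": "fox", "suspicion": 0.95},
]
to_review, auto_accept = select_for_review(items, budget=2)
```

With a budget of two, the two highest-scoring items go to human reviewers and the remaining LLM labels are accepted as they are.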


Andrew Hastie compares AI to cold-war nuclear arms race and warns Australia may fall behind

The Guardian

Andrew Hastie has said the education system should be overhauled so 'we can unleash Australian hearts and minds on AI'. Liberal MP says Australia risks sovereignty and strategic independence being 'constrained by the AI superpowers reshaping the global order'. Liberal MP Andrew Hastie says Australia should dramatically scale up investment in artificial intelligence to preserve strategic independence and warns the country risks being "a supplicant state" tethered to the US in an era of possible hot conflict with China. In a major address to Liberal members in Sydney on Monday night, the shadow minister for industry and sovereign capability likened the development of AI to the nuclear arms race of the cold-war era and proposed Australia position itself as a technology hub in the southern hemisphere. Delivering the annual Tom Hughes Oration, Hastie called for a new AI ambassador to be appointed and said the education system should be overhauled "so we can unleash Australian hearts and minds on AI". He said prime ministers, including Robert Menzies and John Gorton, had wrestled with the question of Australia pursuing nuclear capability, but ultimately aligned our security settings with Washington.


Canada proposes teen social media ban - with workaround for tech firms

BBC News

Canada is proposing a social media ban for children and teenagers under the age of 16, mirroring a similar law passed in Australia late last year. But unlike Australia's law, tech firms could sidestep Canada's ban if they demonstrate they have policies to minimise harm to minors. The law includes sweeping measures to regulate AI chatbots and curtail harmful content online. It would create a regulator to ensure tech firms comply. Some free speech groups have warned it would expand censorship.