YouTube should not be exempt from Australia's under-16s social media ban, eSafety commissioner says
YouTube should be included in the ban on under-16s accessing social media, the nation's online safety chief has said as she urges the Albanese government to rethink its decision to carve out the video sharing platform from new rules which apply to apps such as TikTok, Snapchat and Instagram. The eSafety commissioner, Julie Inman Grant, also recommended the government update its under-16s social media ban to specifically address features such as stories, streaks and AI chatbots which can disproportionately pose risk to young people. The under-16s ban will come into effect in December 2025, despite questions over how designated online platforms would verify users' ages, and the government's own age assurance trial reporting last week that current technology is not "guaranteed to be effective" and face-scanning tools have given incorrect results. Although then communications minister Michelle Rowland initially indicated YouTube would be part of the ban legislated in December 2024, the regulations specifically exempted the Google-owned video site. Guardian Australia revealed YouTube's global chief executive personally lobbied Rowland for an exemption shortly before she announced the carve out.
Four killed in Kyiv in new Russian aerial attack
Jaroslav Lukiv, BBC News
At least four people have been killed in an overnight Russian missile and drone attack on Ukraine's capital Kyiv, the interior minister says. In a post on social media, Ihor Klymenko says residential areas, hospitals and sports infrastructure were hit. "An entire section of a residential high-rise building was destroyed" in the worst-hit Shevchenkivskyi district, he says, adding that some people are trapped under the rubble. In the Kyiv region, a woman was killed and another two people injured in the Russian aerial attack, regional head Mykola Kalashnyk says. The Russian military has not commented on the issue.
The Role of Explanation Styles and Perceived Accuracy on Decision Making in Predictive Process Monitoring
Chae, Soobin, Lee, Suhwan, Hauptmann, Hanna, Reijers, Hajo A., Lu, Xixi
Predictive Process Monitoring (PPM) often uses deep learning models to predict the future behavior of ongoing processes, such as predicting process outcomes. While these models achieve high accuracy, their lack of interpretability undermines user trust and adoption. Explainable AI (XAI) aims to address this challenge by providing the reasoning behind the predictions. However, current evaluations of XAI in PPM focus primarily on functional metrics (such as fidelity), overlooking user-centered aspects such as their effect on task performance and decision-making. This study investigates the effects of explanation styles (feature importance, rule-based, and counterfactual) and perceived AI accuracy (low or high) on decision-making in PPM. We conducted a decision-making experiment, where users were presented with the AI predictions, perceived accuracy levels, and explanations of different styles. Users' decisions were measured both before and after receiving explanations, allowing the assessment of objective metrics (Task Performance and Agreement) and subjective metrics (Decision Confidence). Our findings show that perceived accuracy and explanation style have a significant effect.
ParkFormer: A Transformer-Based Parking Policy with Goal Embedding and Pedestrian-Aware Control
Fu, Jun, Tian, Bin, Chen, Haonan, Meng, Shi, Yao, Tingting
Autonomous parking plays a vital role in intelligent vehicle systems, particularly in constrained urban environments where high-precision control is required. While traditional rule-based parking systems struggle with environmental uncertainties and lack adaptability in crowded or dynamic scenes, human drivers demonstrate the ability to park intuitively without explicit modeling. Inspired by this observation, we propose a Transformer-based end-to-end framework for autonomous parking that learns from expert demonstrations. The network takes as input surround-view camera images, goal-point representations, ego vehicle motion, and pedestrian trajectories. It outputs discrete control sequences including throttle, braking, steering, and gear selection. A novel cross-attention module integrates BEV features with target points, and a GRU-based pedestrian predictor enhances safety by modeling dynamic obstacles. We validate our method on the CARLA 0.9.14 simulator in both vertical and parallel parking scenarios. Experiments show our model achieves a high success rate of 96.57%, with average positional and orientation errors of 0.21 meters and 0.41 degrees, respectively. The ablation studies further demonstrate the effectiveness of key modules such as pedestrian prediction and goal-point attention fusion. The code and dataset will be released at: https://github.com/little-snail-f/ParkFormer.
Reliable Few-shot Learning under Dual Noises
Zhang, Ji, Song, Jingkuan, Gao, Lianli, Sebe, Nicu, Shen, Heng Tao
Recent advances in model pre-training give rise to task adaptation-based few-shot learning (FSL), where the goal is to adapt a pre-trained task-agnostic model for capturing task-specific knowledge with a few labeled support samples of the target task. Nevertheless, existing approaches may still fail in the open world due to the inevitable in-distribution (ID) and out-of-distribution (OOD) noise from both support and query samples of the target task. With limited support samples available, i) the adverse effect of the dual noises can be severely amplified during task adaptation, and ii) the adapted model can produce unreliable predictions on query samples in the presence of the dual noises. In this work, we propose DEnoised Task Adaptation (DETA++) for reliable FSL. DETA++ uses a Contrastive Relevance Aggregation (CoRA) module to calculate image and region weights for support samples, based on which a clean prototype loss and a noise entropy maximization loss are proposed to achieve noise-robust task adaptation. Additionally, DETA++ employs a memory bank to store and refine clean regions for each inner-task class, based on which a Local Nearest Centroid Classifier (LocalNCC) is devised to yield noise-robust predictions on query samples. Moreover, DETA++ utilizes an Intra-class Region Swapping (IntraSwap) strategy to rectify ID class prototypes during task adaptation, enhancing the model's robustness to the dual noises. Extensive experiments demonstrate the effectiveness and flexibility of DETA++.
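The idea of down-weighting noisy support samples before building class prototypes, then classifying queries by their nearest centroid, can be illustrated with a minimal sketch. This is not the DETA++ implementation: the per-sample weights are hand-set here (in the paper they come from the CoRA module), cosine similarity is an assumed choice of metric, and the function names `weighted_prototype`, `cosine`, and `nearest_centroid` are hypothetical.

```python
import math


def weighted_prototype(features, weights):
    """Weighted mean of feature vectors; low weights suppress suspected noise."""
    total = sum(weights)
    dim = len(features[0])
    return [
        sum(w * f[j] for f, w in zip(features, weights)) / total
        for j in range(dim)
    ]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_centroid(query, prototypes):
    """Return the label whose prototype is most similar to the query."""
    return max(prototypes, key=lambda label: cosine(query, prototypes[label]))


# Toy support set: the third "cat" sample is noisy (it resembles a "dog"),
# so it receives a small, hand-picked weight. These weights are assumptions.
support = {
    "cat": ([[1.0, 0.1], [0.9, 0.0], [0.0, 1.0]], [1.0, 1.0, 0.05]),
    "dog": ([[0.1, 1.0], [0.0, 0.9]], [1.0, 1.0]),
}
prototypes = {
    label: weighted_prototype(feats, wts) for label, (feats, wts) in support.items()
}
prediction_a = nearest_centroid([0.8, 0.2], prototypes)
prediction_b = nearest_centroid([0.2, 0.7], prototypes)
```

With uniform weights the noisy sample would pull the "cat" prototype toward the "dog" region; the small weight keeps the prototype close to the clean samples.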
Bandwidth Selectors on Semiparametric Bayesian Networks
Alejandre, Victor, Bielza, Concha, Larrañaga, Pedro
Semiparametric Bayesian networks (SPBNs) integrate parametric and non-parametric probabilistic models, offering flexibility in learning complex data distributions from samples. In particular, kernel density estimators (KDEs) are employed for the non-parametric component. Under the assumption of data normality, the normal rule is used to learn the bandwidth matrix for the KDEs in SPBNs. This matrix is the key hyperparameter that controls the trade-off between bias and variance. However, real-world data often deviates from normality, potentially leading to suboptimal density estimation and reduced predictive performance. This paper first establishes the theoretical framework for the application of state-of-the-art bandwidth selectors and subsequently evaluates their impact on SPBN performance. We explore the approaches of cross-validation and plug-in selectors, assessing their effectiveness in enhancing the learning capability and applicability of SPBNs. To support this investigation, we have extended the open-source package PyBNesian for SPBNs with the additional bandwidth selection techniques and conducted extensive experimental analyses. Our results demonstrate that the proposed bandwidth selectors leverage increasing information more effectively than the normal rule, which, despite its robustness, stagnates with more data. In particular, unbiased cross-validation generally outperforms the normal rule, highlighting its advantage in high sample size scenarios.
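As a point of reference, the normal rule for a full bandwidth matrix is commonly written as H = (4 / (d + 2))^(2 / (d + 4)) * n^(-2 / (d + 4)) * Sigma, where Sigma is the sample covariance, n the sample size, and d the dimension. The sketch below computes it in plain Python under that assumption; it does not use or reproduce the PyBNesian API, and the function names are hypothetical.

```python
def sample_covariance(data):
    """Unbiased sample covariance matrix of the rows in `data`."""
    n = len(data)
    d = len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for row in data:
        for i in range(d):
            for j in range(d):
                cov[i][j] += (row[i] - means[i]) * (row[j] - means[j])
    return [[c / (n - 1) for c in cov_row] for cov_row in cov]


def normal_rule_bandwidth(data):
    """Normal-rule bandwidth matrix: a scalar factor times the sample covariance."""
    n = len(data)
    d = len(data[0])
    factor = (4.0 / (d + 2)) ** (2.0 / (d + 4)) * n ** (-2.0 / (d + 4))
    sigma = sample_covariance(data)
    return [[factor * s for s in row] for row in sigma]


# Four points on the corners of a square: variances 4/3, zero covariance.
points = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
H = normal_rule_bandwidth(points)
```

Because the factor depends only on n and d, the rule shrinks the bandwidth at a fixed rate regardless of the shape of the data, which is why data-driven selectors such as cross-validation can do better when the distribution is far from normal.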
Multi-Armed Bandits With Machine Learning-Generated Surrogate Rewards
Ji, Wenlong, Pan, Yihan, Zhu, Ruihao, Lei, Lihua
Multi-armed bandit (MAB) is a widely adopted framework for sequential decision-making under uncertainty. Traditional bandit algorithms rely solely on online data, which tends to be scarce as it must be gathered during the online phase when the arms are actively pulled. However, in many practical settings, rich auxiliary data, such as covariates of past users, is available prior to deploying any arms. We introduce a new setting for MAB where pre-trained machine learning (ML) models are applied to convert side information and historical data into surrogate rewards. A prominent feature of this setting is that the surrogate rewards may exhibit substantial bias, as true reward data is typically unavailable in the offline phase, forcing ML predictions to heavily rely on extrapolation. To address the issue, we propose the Machine Learning-Assisted Upper Confidence Bound (MLA-UCB) algorithm, which can be applied to any reward prediction model and any form of auxiliary data. When the predicted and true rewards are jointly Gaussian, it provably improves the cumulative regret, provided that the correlation is non-zero -- even in cases where the mean surrogate reward completely misaligns with the true mean rewards. Notably, our method requires no prior knowledge of the covariance matrix between true and surrogate rewards. We compare MLA-UCB with the standard UCB on a range of numerical studies and show a sizable efficiency gain even when the size of the offline data and the correlation between predicted and true rewards are moderate.
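For orientation, the standard UCB baseline mentioned above can be sketched as follows. This is the classical UCB1 rule with unit-variance Gaussian rewards, not MLA-UCB; it uses no surrogate rewards, and the function name `ucb1`, the arm means, and the horizon are illustrative assumptions.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on Gaussian arms; return per-arm pull counts and cumulative regret."""
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k
    sums = [0.0] * k
    best_mean = max(arm_means)
    regret = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            # Initialisation: pull every arm once.
            arm = t - 1
        else:
            # Pick the arm with the largest empirical mean plus exploration bonus.
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = rng.gauss(arm_means[arm], 1.0)
        counts[arm] += 1
        sums[arm] += reward
        # Pseudo-regret: gap between the best mean and the chosen arm's mean.
        regret += best_mean - arm_means[arm]
    return counts, regret


counts, regret = ucb1([0.0, 1.0], horizon=2000)
```

The exploration bonus shrinks only as an arm accumulates online pulls, which is the scarcity that auxiliary offline data is meant to offset in the setting described above.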
Refined Causal Graph Structure Learning via Curvature for Brain Disease Classification
Febrinanto, Falih Gozi, Simango, Adonia, Xu, Chengpei, Zhou, Jingjing, Ma, Jiangang, Tyagi, Sonika, Xia, Feng
The field of neuroscience has been revolutionized by the advent of brain imaging technologies, particularly functional magnetic resonance imaging in the resting state (rest fMRI) (Khalilullah et al, 2023; Vasilkovska et al, 2023; Liu et al, 2024). This powerful tool allows the measurement of blood-oxygen-level-dependent (BOLD) signals in predefined Regions of Interest (ROIs) within the brain, offering an unprecedented avenue for revealing information about potential diseases such as autism spectrum disorder (ASD) and schizophrenia (Philiastides et al, 2021; Kocak, 2021). Various brain atlases, including Harvard-Oxford (Makris et al, 2006) and Craddock 200 (Craddock et al, 2012) parcellations, have been used to define these ROIs. Furthermore, ROIs can be modelled as graph data, where the ROIs themselves represent nodes, and the connections between ROIs represent edges of graphs (Cui et al, 2022b). This graph-based data structure, which draws on techniques from graph theory, has been instrumental in revealing meaningful relationships between ROIs in brain networks to diagnose brain diseases more effectively (Alsubaie et al, 2024; Ren and Xia, 2024). With the current popularity of deep learning, recent frameworks have developed graph neural networks (GNNs) (Xia et al, 2021; Febrinanto et al, 2023c) to extend the merits of modelling graph-structured data for detecting brain diseases with brain networks based on fMRI signals as input (Kan et al, 2022b; Li et al, 2021; Kan et al, 2022a; Cui et al, 2022a; ElGazzar et al, 2022; Febrinanto et al, 2023a). These techniques perform more accurately than typical machine learning or deep learning techniques. However, there is still no clear consensus on how to construct or define an appropriate graph structure in brain networks in terms of two processes: 1) how do we generate the graphs?
Structured Program Synthesis using LLMs: Results and Insights from the IPARC Challenge
Surana, Shraddha, Srinivasan, Ashwin, Bain, Michael
The IPARC Challenge, inspired by ARC, provides controlled program synthesis tasks over synthetic images to evaluate automatic program construction, focusing on sequence, selection, and iteration. This set of 600 tasks has resisted automated solutions. This paper presents a structured inductive programming approach with LLMs that successfully solves tasks across all IPARC categories. The controlled nature of IPARC reveals insights into LLM-based code generation, including the importance of prior structuring, LLMs' ability to aid structuring (requiring human refinement), the need to freeze correct code, the efficiency of code reuse, and how LLM-generated code can spark human creativity. These findings suggest valuable mechanisms for human-LLM collaboration in tackling complex program synthesis.
Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM
Li, Zinuo, Zhang, Xian, Guo, Yongxin, Bennamoun, Mohammed, Boussaid, Farid, Dwivedi, Girish, Gong, Luqi, Ke, Qiuhong
Humans naturally understand moments in a video by integrating visual and auditory cues. For example, localizing a scene in the video like "A scientist passionately speaks on wildlife conservation as dramatic orchestral music plays, with the audience nodding and applauding" requires simultaneous processing of visual, audio, and speech signals. However, existing models often struggle to effectively fuse and interpret audio information, limiting their capacity for comprehensive video temporal understanding. To address this, we present TriSense, a triple-modality large language model designed for holistic video temporal understanding through the integration of visual, audio, and speech modalities. Central to TriSense is a Query-Based Connector that adaptively reweights modality contributions based on the input query, enabling robust performance under modality dropout and allowing flexible combinations of available inputs. To support TriSense's multimodal capabilities, we introduce TriSense-2M, a high-quality dataset of over 2 million curated samples generated via an automated pipeline powered by fine-tuned LLMs. TriSense-2M includes long-form videos and diverse modality combinations, facilitating broad generalization. Extensive experiments across multiple benchmarks demonstrate the effectiveness of TriSense and its potential to advance multimodal video analysis. Code and dataset will be publicly released.