 Oceania Government


Designing Speech Technologies for Australian Aboriginal English: Opportunities, Risks and Participation

arXiv.org Artificial Intelligence

In Australia, post-contact language varieties, including creoles and local varieties of international languages, emerged as a result of forced contact between Indigenous communities and English speakers. These contact varieties are widely used, yet are poorly supported by language technologies. This gap presents barriers to participation in civil and economic society for Indigenous communities using these varieties, and reproduces minoritisation of contemporary Indigenous sociolinguistic identities. This paper concerns three questions regarding this context. First, can speech technologies support speakers of Australian Aboriginal English, a local indigenised variety of English? Second, what risks are inherent in such a project? Third, what technology development practices are appropriate for this context, and how can researchers integrate meaningful community participation in order to mitigate risks? We argue that opportunities do exist -- as well as risks -- and demonstrate this through a case study exploring design practices in a real-world project aiming to improve speech technologies for Australian Aboriginal English. We discuss how we integrated culturally appropriate and participatory processes throughout the project. We call for increased support for languages used by Indigenous communities, including contact varieties, which provide practical economic and socio-cultural benefits, provided that participatory and culturally safe practices are enacted.


Unveiling AI's Threats to Child Protection: Regulatory efforts to Criminalize AI-Generated CSAM and Emerging Children's Rights Violations

arXiv.org Artificial Intelligence

This paper presents alarming new trends in child sexual abuse through imagery, as part of SafeLine's research activities in the field of cybercrime, child sexual abuse material (CSAM) and the protection of children's rights to safe online experiences. It focuses primarily on the phenomenon of AI-generated CSAM, the sophisticated production methods discussed in dark web forums, and the crucial role that open-source AI models play in the evolution of this phenomenon. The paper's main contribution is a correlation analysis between the hotline's reports and domain names identified in dark web forums, where users' discussions focus on exchanging information specifically related to the generation of AI-CSAM. The objective was to reveal the close connection between clear net and dark web content, which was accomplished through the use of the ATLAS dataset of the Voyager system. Furthermore, analysis of a set of posts drawn from this dataset yields conclusions about the techniques forum members employ to produce AI-generated CSAM, their views on this type of content, and the routes they follow to circumvent technological safeguards designed to prevent malicious use. As its final contribution, the paper presents an overview of current legislative developments in all member countries of the INHOPE organization and the issues arising in the process of regulating AI-CSAM, shedding light on the legal challenges of regulating and limiting the phenomenon.


Australia bans DeepSeek from government tech, citing security

The Japan Times

Australia has banned DeepSeek AI services from all government systems and devices, becoming one of the first countries to take direct action against a Chinese artificial intelligence startup that shook Silicon Valley and global markets this year. Home Affairs Minister Tony Burke said in a statement Tuesday that all DeepSeek products, applications and services would be removed from government systems on national security grounds effective immediately. A threat assessment by the country's intelligence agencies found the technology posed an unacceptable risk, he said. Founded in Hangzhou only 20 months ago, DeepSeek's technology made waves in January with a new mobile app featuring its reasoning AI chatbot -- which articulates its approximation of thought process and research before delivering a response -- that seemed to suggest top-tier AI could be developed without huge investments in hardware. Its appeal took it to the top of worldwide download charts.


DeepSeek banned from Australian government devices over national security concerns

The Guardian

DeepSeek will be banned from all federal government devices as the Albanese government cracks down on the Chinese AI chatbot, citing unspecified national security risks. The launch of DeepSeek's AI generative chatbot rocked US tech stocks last week amid concerns over censorship and data security. The home affairs department secretary signed a directive on Tuesday banning the program from all federal government systems and devices on national security grounds after advice from intelligence agencies that it poses an unacceptable risk. The home affairs minister, Tony Burke, said the decision was not impacted by the app's country of origin – China – but by its risk to the government and its assets. "The Albanese government is taking swift and decisive action to protect Australia's national security and national interest," Burke said.


Methods for Legal Citation Prediction in the Age of LLMs: An Australian Law Case Study

arXiv.org Artificial Intelligence

In recent years, Large Language Models (LLMs) have shown great potential across a wide range of legal tasks. Despite these advances, mitigating hallucination remains a significant challenge, with state-of-the-art LLMs still frequently generating incorrect legal references. In this paper, we focus on the problem of legal citation prediction within the Australian law context, where correctly identifying and citing relevant legislation or precedents is critical. We compare several approaches: prompting general-purpose and law-specialised LLMs, retrieval-only pipelines with both generic and domain-specific embeddings, task-specific instruction-tuning of LLMs, and hybrid strategies that combine LLMs with retrieval augmentation, query expansion, or voting ensembles. Our findings indicate that domain-specific pre-training alone is insufficient for achieving satisfactory citation accuracy. In contrast, instruction tuning on our task-specific dataset dramatically boosts performance, reaching the best results across all settings. We also highlight that database granularity, along with the type of embeddings, plays a critical role in the performance of retrieval systems. Among retrieval-based approaches, hybrid methods consistently outperform retrieval-only setups, and among these, ensemble voting delivers the best result by combining the predictive quality of instruction-tuned LLMs with the retrieval system.
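The abstract does not say how the voting ensemble is built. As a rough illustration only, the sketch below merges ranked candidate lists from an LLM and a retriever using weighted reciprocal-rank voting. The function name `vote`, the weighting scheme, and the example citations are all assumptions made for this sketch, not the authors' method.

```python
from collections import defaultdict


def vote(ranked_lists, weights=None, top_k=1):
    """Combine ranked candidate lists by weighted reciprocal-rank voting.

    Hypothetical helper: each source contributes weight / rank to every
    candidate it proposes, and the highest total score wins.
    """
    if weights is None:
        weights = [1.0] * len(ranked_lists)
    scores = defaultdict(float)
    for ranking, weight in zip(ranked_lists, weights):
        for rank, citation in enumerate(ranking, start=1):
            scores[citation] += weight / rank
    # Sort by score (descending), then by name for a deterministic tie-break.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [citation for citation, _ in ordered[:top_k]]


# Made-up candidates from an instruction-tuned LLM and a retrieval system.
llm_predictions = ["Case A", "Case B"]
retriever_hits = ["Case B", "Case C", "Case A"]
print(vote([llm_predictions, retriever_hits], top_k=3))
```

Here "Case B" ranks first because both sources place it near the top, which is the intuition behind combining a generator with a retriever. Any real system would need to tune the weights on held-out data.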


Spatioformer: A Geo-encoded Transformer for Large-Scale Plant Species Richness Prediction

arXiv.org Artificial Intelligence

Earth observation data have shown promise in predicting species richness of vascular plants ($\alpha$-diversity), but extending this approach to large spatial scales is challenging because geographically distant regions may exhibit different compositions of plant species ($\beta$-diversity), resulting in a location-dependent relationship between richness and spectral measurements. In order to handle such geolocation dependency, we propose Spatioformer, where a novel geolocation encoder is coupled with the transformer model to encode geolocation context into remote sensing imagery. The Spatioformer model compares favourably to state-of-the-art models in richness predictions on a large-scale ground-truth richness dataset (HAVPlot) that consists of 68,170 in-situ richness samples covering diverse landscapes across Australia. The results demonstrate that geolocational information is advantageous in predicting species richness from satellite observations over large spatial scales. With Spatioformer, plant species richness maps over Australia are compiled from the Landsat archive for the years 2015 to 2023. The richness maps produced in this study reveal the spatiotemporal dynamics of plant species richness in Australia, providing supporting evidence to inform effective planning and policy development for plant diversity conservation. Regions of high richness prediction uncertainty are identified, highlighting the need for future in-situ surveys in these areas to enhance prediction accuracy.


Bushfire Severity Modelling and Future Trend Prediction Across Australia: Integrating Remote Sensing and Machine Learning

arXiv.org Artificial Intelligence

Bushfires are among the major natural disasters, causing huge losses to livelihoods and the environment. Understanding and analyzing the severity of bushfires is crucial for effective management and mitigation strategies that help prevent this damage. This study presents an in-depth analysis of bushfire severity in Australia over the last twelve years, combining remote sensing data and machine learning techniques to predict future fire trends. By utilizing Landsat imagery and integrating spectral indices like NDVI, NBR, and Burn Index, along with topographical and climatic factors, we developed a predictive model using XGBoost. The model achieved an accuracy of 86.13%, demonstrating its effectiveness in predicting fire severity across diverse Australian ecosystems. By analyzing historical trends and integrating factors such as population density and vegetation cover, we identify areas at high risk of future severe bushfires and provide data-driven recommendations for targeted firefighting efforts. The findings contribute insights into fire management strategies, enhancing resilience to future fire events in Australia. Finally, we propose future work on a UAV-based swarm coordination model to enhance real-time fire prediction and firefighting capabilities in the most vulnerable regions.
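Two of the spectral indices named above have standard definitions: NDVI = (NIR - Red) / (NIR + Red) and NBR = (NIR - SWIR2) / (NIR + SWIR2). The minimal sketch below computes both per pixel. The reflectance values are made up, the function names are illustrative, and the abstract does not specify the band choices or preprocessing the authors used.

```python
def normalized_difference(band_a, band_b):
    """Return (a - b) / (a + b) per pixel; None where the denominator is zero."""
    result = []
    for a, b in zip(band_a, band_b):
        total = a + b
        result.append((a - b) / total if total != 0 else None)
    return result


def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)


def nbr(nir, swir2):
    """Normalized Burn Ratio."""
    return normalized_difference(nir, swir2)


# Hypothetical surface-reflectance values for three pixels (not real data).
nir = [0.50, 0.30, 0.0]
red = [0.10, 0.30, 0.0]
swir2 = [0.10, 0.45, 0.0]

print(ndvi(nir, red))
print(nbr(nir, swir2))
```

Healthy vegetation tends to give high NDVI and NBR, while recently burned ground tends to give low or negative NBR, which is why such indices are common inputs to severity models.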


Machine Learning to Detect Anxiety Disorders from Error-Related Negativity and EEG Signals

arXiv.org Artificial Intelligence

Anxiety is widespread, with an occurrence rate of approximately 20% [World Health Organization, 2017]. Between 2020 and 2022, over one in six people in Australia (17.2% or 3.4 million people) aged 16 to 85 years experienced an anxiety disorder [Australian Bureau of Statistics]. Anxiety can be triggered by changes in a person's situation, and common symptoms include nervousness, sweating, trembling and excessive worrying, which affect daily life. Anxiety disorders encompass a range of conditions, such as generalised anxiety disorder (GAD), panic disorder (PD), social anxiety disorder (SAD), obsessive-compulsive disorder (OCD), various phobia-related disorders, physical pain related protective behaviour [Li et al., 2020, 2021] and depression [Ghosh and Anwar, 2021]. Current clinical approaches for diagnosing these disorders often suffer from limitations in accuracy and objectivity, relying heavily on self-reports, patient histories and clinical observations. These methods can be subjective and may not capture the nuanced neural and behavioural patterns associated with anxiety, leading to potential misdiagnoses. Recent research has shown promising results in using machine learning techniques to detect anxiety from physiological signals, such as respiration, electrocardiogram (ECG), photoplethysmography (PPG), electrodermal activity (EDA) and electroencephalography (EEG), by identifying patterns associated with anxiety states [Abd-Alrazaq et al., 2023].


Real-Time Energy Pricing in New Zealand: An Evolving Stream Analysis

arXiv.org Artificial Intelligence

This paper introduces a group of novel datasets representing real-time time-series and streaming data of energy prices in New Zealand, sourced from the Electricity Market Information (EMI) website maintained by the New Zealand government. The datasets are intended to address the scarcity of proper datasets for streaming regression learning tasks. We conduct extensive analyses and experiments on these datasets, covering preprocessing techniques, regression tasks, prediction intervals, concept drift detection, and anomaly detection. Our experiments demonstrate the datasets' utility and highlight the challenges and opportunities for future research in energy price forecasting.
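Streaming regression is commonly evaluated prequentially: each incoming observation is first used to test the current model and only afterwards to update it. The sketch below shows that loop with an exponential-moving-average forecaster as a stand-in model. The forecaster, the `alpha` parameter, and the price values are assumptions for illustration and are not taken from the paper or the EMI data.

```python
def prequential_mae(stream, alpha=0.5):
    """Test-then-train mean absolute error of an exponential-moving-average forecaster."""
    forecast = None
    abs_errors = []
    for price in stream:
        if forecast is None:
            # First observation: nothing to test yet, just initialise the model.
            forecast = price
            continue
        abs_errors.append(abs(price - forecast))           # test first
        forecast = alpha * price + (1 - alpha) * forecast  # then train
    return sum(abs_errors) / len(abs_errors) if abs_errors else None


# Made-up prices, processed one at a time as if arriving from a stream.
prices = [100.0, 110.0, 105.0]
print(prequential_mae(prices))
```

Because every prediction is made before the true value is seen, the running error reflects how the model would have behaved online, and a sustained rise in that error is one simple signal of possible concept drift.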


Spatial Temporal Approach for High-Resolution Gridded Wind Forecasting across Southwest Western Australia

arXiv.org Artificial Intelligence

Accurate forecasting of wind speed and direction is paramount across various domains, playing a pivotal role in weather prediction, renewable energy generation, agricultural management, and bushfire mitigation efforts. Accurate predictions enable meteorologists to deepen their understanding of atmospheric processes, leading to more precise weather forecasts and timely alerts for severe weather events [1]. In the realm of renewable energy, precise forecasts of wind conditions are indispensable to optimise the performance of wind farms and integrate wind energy efficiently into the power grid [2-4]. In agriculture, wind forecasts inform critical decisions such as crop spraying, sprinkler or central pivot irrigation timing, and pest control, ultimately improving crop yields and water management [5]. For bushfire management, timely and accurate predictions of wind speed and direction are crucial for modelling fire behaviour, planning firefighter deployment, and planning evacuations, thereby reducing the impact of bushfires on communities and ecosystems [6, 7]. Given the multifaceted applications of wind forecasting, advancements in machine learning-based techniques for predicting wind speed and direction hold immense promise for bolstering societal resilience and fostering sustainable development. Traditionally, wind forecasting models fall into three categories: physical, statistical time series analysis, and machine learning.