Oceania
The Great Language Flattening
In at least one crucial way, AI has already won its campaign for global dominance. An unbelievable volume of synthetic prose is published every moment of every day--heaping piles of machine-written news articles, text messages, emails, search results, customer-service chats, even scientific research. Chatbots learned from human writing. Now the influence may run in the other direction. Some people have hypothesized that the proliferation of generative-AI tools such as ChatGPT will seep into human communication, that the terse language we use when prompting a chatbot may lead us to dispose of any niceties or writerly flourishes when corresponding with friends and colleagues.
Bearing-Only Tracking and Circumnavigation of a Fast Time-Varied Velocity Target Utilising an LSTM
Torok, Mitchell, Deghat, Mohammad, Song, Yang
-- Bearing-only tracking, localisation, and circumnavigation is a problem in which a single or a group of agents attempts to track a target while circumnavigating it at a fixed distance using only bearing measurements. While previous studies have addressed scenarios involving stationary targets or those moving with an unknown constant velocity, the challenge of accurately tracking a target moving with a time-varying velocity remains open. This paper presents an approach utilising a Long Short-T erm Memory (LSTM) based estimator for predicting the target's position and velocity. We also introduce a corresponding control strategy. When evaluated against previously proposed estimation and circumnavigation approaches, our approach demonstrates significantly lower control and estimation errors across various time-varying velocity scenarios. Additionally, we illustrate the effectiveness of the proposed method in tracking targets with a double integrator nonholonomic system dynamics that mimic real-world systems. Target localisation and tracking is a problem in which a single or group of agents is tasked with estimating the position and following a target over time. This task becomes particularly complex when dealing with an uncooperative target whose state information is not directly accessible to the agent(s).
Deep Learning-Based Multi-Modal Fusion for Robust Robot Perception and Navigation
Lai, Delun, Zhang, Yeyubei, Liu, Yunchong, Li, Chaojie, Mo, Huadong
-- This paper introduces a novel deep learning - based multimodal fusion architecture aimed at enhancing the perception capabilities of autonomous navigation robots in complex environments. The key contributions of this work are as follows: a. the design of a lightweight feature extraction network to enhance feature representati on; b. the development of an adaptive weighted cross - modal fusion strategy to improve system robustness; and c. the incorporation of time - series information modeling to boost dynamic scene perception accuracy. Experimental results on the KITTI dataset demonstrate that the proposed approach increases navigation and positioning accuracy by 3.5% and 2.2%, respectively, while maintaining real - time performance. This work provides a novel solution for autonomous robot navigation in complex environments. With the rapid development of robotics, autonomous navigation capability has become a core requirement for robotic systems. In practical applications, robots need to accurately perceive and reliably navigate in dynamically changing environments.
IoT Botnet Detection: Application of Vision Transformer to Classification of Network Flow Traffic
Wasswa, Hassan, Lynar, Timothy, Nanyonga, Aziida, Abbass, Hussein
Despite the demonstrated effectiveness of transformer models in NLP, and image and video classification, the available tools for extracting features from captured IoT network flow packets fail to capture sequential patterns in addition to the absence of spatial patterns consequently limiting transformer model application. This work introduces a novel preprocessing method to adapt transformer models, the vision transformer (ViT) in particular, for IoT botnet attack detection using network flow packets. The approach involves feature extraction from .pcap files and transforming each instance into a 1-channel 2D image shape, enabling ViT-based classification. Also, the ViT model was enhanced to allow use any classifier besides Multilayer Perceptron (MLP) that was deployed in the initial ViT paper. Models including the conventional feed forward Deep Neural Network (DNN), LSTM and Bidirectional-LSTM (BLSTM) demonstrated competitive performance in terms of precision, recall, and F1-score for multiclass-based attack detection when evaluated on two IoT attack datasets.
A Study on Mixup-Inspired Augmentation Methods for Software Vulnerability Detection
Daneshvar, Seyed Shayan, Tan, Da, Wang, Shaowei, Leung, Carson
Various deep learning (DL) methods have recently been utilized to detect software vulnerabilities. Real-world software vulnerability datasets are rare and hard to acquire, as there is no simple metric for classifying vulnerability. Such datasets are heavily imbalanced, and none of the current datasets are considered huge for DL models. To tackle these problems, a recent work has tried to augment the dataset using the source code and generate realistic single-statement vulnerabilities, which is not quite practical and requires manual checking of the generated vulnerabilities. In this paper, we aim to explore the augmentation of vulnerabilities at the representation level to help current models learn better, which has never been done before to the best of our knowledge. We implement and evaluate five augmentation techniques that augment the embedding of the data and have recently been used for code search, which is a completely different software engineering task. We also introduced a conditioned version of those augmentation methods, which ensures the augmentation does not change the vulnerable section of the vector representation. We show that such augmentation methods can be helpful and increase the F1-score by up to 9.67%, yet they cannot beat Random Oversampling when balancing datasets, which increases the F1-score by 10.82%.
Commissioner calls for ban on apps that make deepfake nude images of children
Artificial intelligence "nudification" apps that create deepfake sexual images of children should be immediately banned, amid growing fears among teenage girls that they could fall victim, the children's commissioner for England is warning. Girls said they were stopping posting images of themselves on social media out of a fear that generative AI tools could be used to digitally remove their clothes or sexualise them, according to the commissioner's report on the tools, drawing on children's experiences. Although it is illegal to create or share a sexually explicit image of a child, the technology enabling them remains legal, the report noted. "Children have told me they are frightened by the very idea of this technology even being available, let alone used. They fear that anyone โ a stranger, a classmate, or even a friend โ could use a smartphone as a way of manipulating them by creating a naked image using these bespoke apps," the commissioner, Dame Rachel de Souza, said.
Rethinking Label-specific Features for Label Distribution Learning
Xu, Suping, Dai, Chuyi, Shang, Lin, Shao, Changbin, Yang, Xibei, Pedrycz, Witold
--Label distribution learning (LDL) is an emerging learning paradigm designed to capture the relative importance of labels for each instance. Label-specific features (LSFs), constructed by LIFT, have proven effective for learning tasks with label ambiguity by leveraging clustering-based prototypes for each label to re-characterize instances. However, directly introducing LIFT into LDL tasks can be suboptimal, as the prototypes it collects primarily reflect intra-cluster relationships while neglecting interactions among distinct clusters. Additionally, constructing LSFs using multi-perspective information, rather than relying solely on Euclidean distance, provides a more robust and comprehensive representation of instances, mitigating noise and bias that may arise from a single distance perspective. T o address these limitations, we introduce Structural Anchor Points (SAPs) to capture inter-cluster interactions. This leads to a novel LSFs construction strategy, LIFT -SAP, which enhances LIFT by integrating both distance and direction information of each instance relative to SAPs. Furthermore, we propose a novel LDL algorithm, Label Distribution Learning via Label-specifIc FeaT ure with SAPs (LDL-LIFT -SAP), which unifies multiple label description degrees predicted from different LSF spaces into a cohesive label distribution. Extensive experiments on 15 real-world datasets demonstrate the effectiveness of LIFT -SAP over LIFT, as well as the superiority of LDL-LIFT -SAP compared to seven other well-established algorithms. Index T erms --Label distribution learning, Label-specific features, Structural anchor points, Prototypes, Direction information.
Factor Analysis with Correlated Topic Model for Multi-Modal Data
ลazฤcka, Maลgorzata, Szczurek, Ewa
Integrating various data modalities brings valuable insights into underlying phenomena. Multimodal factor analysis (FA) uncovers shared axes of variation underlying different simple data modalities, where each sample is represented by a vector of features. However, FA is not suited for structured data modalities, such as text or single cell sequencing data, where multiple data points are measured per each sample and exhibit a clustering structure. To overcome this challenge, we introduce FACTM, a novel, multi-view and multi-structure Bayesian model that combines FA with correlated topic modeling and is optimized using variational inference. Additionally, we introduce a method for rotating latent factors to enhance interpretability with respect to binary features. On text and video benchmarks as well as real-world music and COVID-19 datasets, we demonstrate that FACTM outperforms other methods in identifying clusters in structured data, and integrating them with simple modalities via the inference of shared, interpretable factors.
Fried Parameter Estimation from Single Wavefront Sensor Image with Artificial Neural Networks
Smith, Jeffrey, Fujii, Taisei, Cranney, Jesse, Gretton, Charles
Atmospheric turbulence degrades the quality of astronomical observations in ground-based telescopes, leading to distorted and blurry images. Adaptive Optics (AO) systems are designed to counteract these effects, using atmospheric measurements captured by a wavefront sensor to make real-time corrections to the incoming wavefront. The Fried parameter, r0, characterises the strength of atmospheric turbulence and is an essential control parameter for optimising the performance of AO systems and more recently sky profiling for Free Space Optical (FSO) communication channels. In this paper, we develop a novel data-driven approach, adapting machine learning methods from computer vision for Fried parameter estimation from a single Shack-Hartmann or pyramid wavefront sensor image. Using these data-driven methods, we present a detailed simulation-based evaluation of our approach using the open-source COMPASS AO simulation tool to evaluate both the Shack-Hartmann and pyramid wavefront sensors. Our evaluation is over a range of guide star magnitudes, and realistic noise, atmospheric and instrument conditions. Remarkably, we are able to develop a single network-based estimator that is accurate in both open and closed-loop AO configurations. Our method accurately estimates the Fried parameter from a single WFS image directly from AO telemetry to a few millimetres. Our approach is suitable for real time control, exhibiting 0.83ms r0 inference times on retail NVIDIA RTX 3090 GPU hardware, and thereby demonstrating a compelling economic solution for use in real-time instrument control.
Proof of Useful Intelligence (PoUI): Blockchain Consensus Beyond Energy Waste
Chong, Zan-Kai, Ohsaki, Hiroyuki, Ng, Bryan
Blockchain technology enables secure, transparent data management in decentralized systems, supporting applications from cryptocurrencies like Bitcoin to tokenizing real-world assets like property. Its scalability and sustainability hinge on consensus mechanisms balancing security and efficiency. Proof of Work (PoW), used by Bitcoin, ensures security through energy-intensive computations but demands significant resources. Proof of Stake (PoS), as in Ethereum post-Merge, selects validators based on staked cryptocurrency, offering energy efficiency but risking centralization from wealth concentration. With AI models straining computational resources, we propose Proof of Useful Intelligence (PoUI), a hybrid consensus mechanism. In PoUI, workers perform AI tasks like language processing or image analysis to earn coins, which are staked to secure the network, blending security with practical utility. Decentralized nodes--job posters, market coordinators, workers, and validators --collaborate via smart contracts to manage tasks and rewards.