Overview
ALPEC: A Comprehensive Evaluation Framework and Dataset for Machine Learning-Based Arousal Detection in Clinical Practice
Kraft, Stefan, Theissler, Andreas, Wienhausen-Wilke, Vera, Walter, Philipp, Kasneci, Gjergji
Detecting arousals in sleep is essential for diagnosing sleep disorders. However, using Machine Learning (ML) in clinical practice is impeded by fundamental issues, primarily due to mismatches between clinical protocols and ML methods. Clinicians typically annotate only the onset of arousals, while ML methods rely on annotations for both the beginning and end. Additionally, there is no standardized evaluation methodology tailored to clinical needs for arousal detection models. This work addresses these issues by introducing a novel post-processing and evaluation framework emphasizing approximate localization and precise event count (ALPEC) of arousals. We recommend that ML practitioners focus on detecting arousal onsets, aligning with clinical practice. We examine the impact of this shift on current training and evaluation schemes, addressing simplifications and challenges. We utilize a novel comprehensive polysomnographic dataset (CPS) that reflects the aforementioned clinical annotation constraints and includes modalities not present in existing polysomnographic datasets. We release the dataset alongside this paper, demonstrating the benefits of leveraging multimodal data for arousal onset detection. Our findings significantly contribute to integrating ML-based arousal detection in clinical settings, reducing the gap between technological advancements and clinical needs.
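As a rough illustration of what evaluating "approximate localization and precise event count" could look like, the sketch below matches predicted arousal onsets to annotated onsets within a time tolerance and reports both detection metrics and the count difference. It is not the ALPEC procedure from the paper: the function names, the greedy one-to-one matching, and the 5-second tolerance are all assumptions made for illustration.

```python
def match_onsets(true_onsets, pred_onsets, tolerance_s=5.0):
    """Greedily match each predicted onset (seconds) to the nearest
    still-unmatched annotated onset within tolerance_s.

    Returns (true_positives, false_positives, false_negatives).
    The tolerance value is a hypothetical choice, not taken from the paper.
    """
    true_sorted = sorted(true_onsets)
    pred_sorted = sorted(pred_onsets)
    matched = set()  # indices of annotated onsets already claimed
    tp = 0
    for p in pred_sorted:
        best = None  # (distance, index) of the closest free annotation
        for i, t in enumerate(true_sorted):
            if i in matched:
                continue
            d = abs(p - t)
            if d <= tolerance_s and (best is None or d < best[0]):
                best = (d, i)
        if best is not None:
            matched.add(best[1])
            tp += 1
    fp = len(pred_sorted) - tp
    fn = len(true_sorted) - tp
    return tp, fp, fn


def precision_recall_f1(tp, fp, fn):
    """Standard detection metrics computed from the matched counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def count_error(true_onsets, pred_onsets):
    """Signed difference between predicted and annotated event counts."""
    return len(pred_onsets) - len(true_onsets)


if __name__ == "__main__":
    annotated = [10.0, 50.0, 100.0]                # made-up onset times
    predicted = [12.0, 48.0, 70.0, 103.0, 104.0]   # made-up predictions
    tp, fp, fn = match_onsets(annotated, predicted)
    print(tp, fp, fn, precision_recall_f1(tp, fp, fn),
          count_error(annotated, predicted))
```

Because each annotated onset can be claimed only once, duplicate detections of the same arousal count as false positives, which penalizes an inflated event count even when localization is good.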
Recent Advancement of Emotion Cognition in Large Language Models
Emotion cognition in large language models (LLMs) is crucial for enhancing performance across various applications, such as social media, human-computer interaction, and mental health assessment. We explore the current landscape of research, which primarily revolves around emotion classification, emotionally rich response generation, and Theory of Mind assessments, while acknowledging challenges such as dependency on annotated data and complexity in emotion processing. In this paper, we present a detailed survey of recent progress in LLMs for emotion cognition. We explore key research studies, methodologies, outcomes, and resources, aligning them with Ulric Neisser's cognitive stages. Additionally, we outline potential future directions for research in this evolving field, including unsupervised learning approaches and the development of more complex and interpretable emotion cognition LLMs. We also discuss advanced methods such as contrastive learning used to improve LLMs' emotion cognition capabilities.
Towards LifeSpan Cognitive Systems
Wang, Yu, Han, Chi, Wu, Tongtong, He, Xiaoxin, Zhou, Wangchunshu, Sadeq, Nafis, Chen, Xiusi, He, Zexue, Wang, Wei, Haffari, Gholamreza, Ji, Heng, McAuley, Julian
Building a human-like system that continuously interacts with complex environments -- whether simulated digital worlds or human society -- presents several key challenges. Central to this is enabling continuous, high-frequency interactions, where the interactions are termed experiences. We refer to this envisioned system as the LifeSpan Cognitive System (LSCS). A critical feature of LSCS is its ability to engage in incremental and rapid updates while retaining and accurately recalling past experiences. We identify two major challenges in achieving this: (1) Abstraction and Experience Merging, and (2) Long-term Retention with Accurate Recall. These properties are essential for storing new experiences, organizing past experiences, and responding to the environment in ways that leverage relevant historical data. Unlike language models with continual learning, which typically rely on large corpora for fine-tuning and focus on improving performance within specific domains or tasks, LSCS must rapidly and incrementally update with new information from its environment at a high frequency. Existing technologies with the potential of solving the above two major challenges can be classified into four classes based on a conceptual metric called Storage Complexity, which measures the relative space required to store past experiences. Each of these four classes of technologies has its own strengths and limitations. Given that none of the existing technologies can achieve LSCS alone, we propose a novel paradigm for LSCS that integrates all four classes of technologies. The new paradigm operates through two core processes: Absorbing Experiences and Generating Responses.
Relationship between Uncertainty in DNNs and Adversarial Attacks
Adeniran, Abigail, Adeyemo, Adewale
Deep Neural Networks (DNNs) have achieved state-of-the-art results and even outperformed human accuracy in many challenging tasks, leading to the adoption of DNNs in a variety of fields including natural language processing, pattern recognition, prediction, and control optimization. However, DNNs are accompanied by uncertainty about their results, causing them to predict an outcome that is either incorrect or outside of a certain level of confidence. These uncertainties stem from model or data constraints, which could be exacerbated by adversarial attacks. Adversarial attacks aim to provide perturbed input to DNNs, causing the DNN to make incorrect predictions or increasing model uncertainty. In this review, we explore the relationship between DNN uncertainty and adversarial attacks, emphasizing how adversarial attacks might raise DNN uncertainty.
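One common way to quantify the kind of predictive uncertainty discussed here is the entropy of the softmax output. The sketch below is a toy illustration only: the logit values are invented, no actual attack is run, and the review itself does not prescribe this measure. It shows how a perturbation that flattens the output distribution corresponds to higher entropy.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predictive_entropy(probs):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


if __name__ == "__main__":
    # Hypothetical logits for a clean input and for the same input after
    # an adversarial perturbation; both vectors are made up for illustration.
    clean_logits = [6.0, 1.0, 0.5]
    perturbed_logits = [2.0, 1.8, 0.5]

    h_clean = predictive_entropy(softmax(clean_logits))
    h_pert = predictive_entropy(softmax(perturbed_logits))
    print(f"entropy clean: {h_clean:.4f}, perturbed: {h_pert:.4f}")
```

Entropy is bounded above by the log of the number of classes, which is reached when the model assigns equal probability to every class.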
High-dimensional learning of narrow neural networks
Recent years have been marked by the fast-paced diversification and increasing ubiquity of machine learning applications. Yet, a firm theoretical understanding of the surprising efficiency of neural networks in learning from high-dimensional data still proves largely elusive. In this endeavour, analyses inspired by statistical physics have proven instrumental, enabling the tight asymptotic characterization of the learning of neural networks in high dimensions, for a broad class of solvable models. This manuscript reviews the tools and ideas underlying recent progress in this line of work. We introduce a generic model -- the sequence multi-index model -- which encompasses numerous previously studied models as special instances. This unified framework covers a broad class of machine learning architectures with a finite number of hidden units, including multi-layer perceptrons, autoencoders, and attention mechanisms; and tasks, including (un)supervised learning, denoising, and contrastive learning, in the limit of large data dimension and comparably large number of samples. We explicate in full detail the analysis of the learning of sequence multi-index models, using statistical physics techniques such as the replica method and approximate message-passing algorithms. This manuscript thus provides a unified presentation of analyses reported in several previous works, and a detailed overview of central techniques in the field of statistical physics of machine learning. This review should be a useful primer for machine learning theoreticians curious about statistical physics approaches; it should also be of value to statistical physicists interested in the transfer of such ideas to the study of neural networks.
Evolution and challenges of computer vision and deep learning technologies for analysing mixed construction and demolition waste
Langley, Adrian, Lonergan, Matthew, Huang, Tao, Azghadi, Mostafa Rahimi
Improving the automatic and timely recognition of construction and demolition waste (C&DW) composition is crucial for enhancing business returns, economic outcomes, and sustainability. Technologies like computer vision, artificial intelligence (AI), robotics, and internet of things (IoT) are increasingly integrated into waste processing to achieve these goals. While deep learning (DL) models show promise in recognising homogeneous C&DW piles, few studies assess their performance with mixed, highly contaminated material in commercial settings. Drawing on extensive experience at a C&DW materials recovery facility (MRF) in Sydney, Australia, we explore the challenges and opportunities in developing an advanced automated mixed C&DW management system. We begin with an overview of the evolution of waste management in the construction industry, highlighting its environmental, economic, and societal impacts. We review various C&DW analysis techniques, concluding that DL-based visual methods are the optimal solution. Additionally, we examine the progression of sensor and camera technologies for C&DW analysis as well as the evolution of DL algorithms focused on object detection and material segmentation. We also discuss C&DW datasets, their curation, and innovative methods for their creation. Finally, we share insights on C&DW visual analysis, addressing technical and commercial challenges, research trends, and future directions for mixed C&DW analysis. This paper aims to improve the efficiency of C&DW management by providing valuable insights for ongoing and future research and development efforts in this critical sector.
Enhancing TinyBERT for Financial Sentiment Analysis Using GPT-Augmented FinBERT Distillation
In the rapidly evolving field of financial sentiment analysis, the efficiency and accuracy of predictive models are critical due to their significant impact on financial markets. Transformer-based models like BERT and large language models (LLMs) like GPT-4 have advanced NLP tasks considerably. Despite their advantages, BERT-based models face challenges with computational intensity in edge computing environments, and the substantial size and compute requirements of LLMs limit their practical deployment. This study proposes leveraging the generative capabilities of LLMs, such as GPT-4 Omni, to create synthetic, domain-specific training data. This approach addresses the challenge of data scarcity and enhances the performance of smaller models by making them competitive with their larger counterparts. The research specifically aims to enhance FinBERT, a BERT model fine-tuned for financial sentiment analysis, and develop TinyFinBERT, a compact transformer model, through a structured, two-tiered knowledge distillation strategy. Using data augmented by GPT-4 Omni, which involves generating new training examples and transforming existing data, we significantly improved the accuracy of FinBERT, preparing it to serve as a teacher model. This enhanced FinBERT then distilled knowledge to TinyFinBERT, employing both GPT-4 Omni and GPT-3.5 Turbo augmented data. The distillation strategy incorporated both logit and intermediate layer distillation. The training and evaluation of TinyFinBERT utilized the PhraseBank dataset and the FiQA 2018 Task1 dataset, achieving performance comparable to FinBERT while being substantially smaller and more efficient. This research demonstrates how LLMs can effectively contribute to the advancement of financial sentiment analysis by enhancing the capabilities of smaller, more efficient models through innovative data augmentation and distillation techniques.
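For readers unfamiliar with logit distillation, a minimal sketch of one common formulation follows: the student is trained to match the teacher's temperature-softened output distribution via a KL divergence scaled by the squared temperature. The abstract does not state the exact loss used for TinyFinBERT, so the function names, the temperature of 2.0, and the three-class logits below are assumptions for illustration only; intermediate layer distillation is not shown.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax of logits divided by a temperature (higher is softer)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def logit_distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions,
    multiplied by temperature**2 as in the widely used soft-target
    formulation. The default temperature is a hypothetical choice.
    """
    p_teacher = softmax_with_temperature(teacher_logits, temperature)
    p_student = softmax_with_temperature(student_logits, temperature)
    kl = sum(
        pt * (math.log(pt) - math.log(ps))
        for pt, ps in zip(p_teacher, p_student)
    )
    return (temperature ** 2) * kl


if __name__ == "__main__":
    # Made-up logits over three sentiment classes
    # (negative, neutral, positive) for a single sentence.
    teacher = [0.2, 1.0, 3.5]
    student = [0.5, 1.5, 2.0]
    print(f"distillation loss: {logit_distillation_loss(student, teacher):.4f}")
```

In practice this term is usually combined with a standard cross-entropy loss on the ground-truth labels; the weighting between the two is another tunable choice not specified here.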
Text2Traj2Text: Learning-by-Synthesis Framework for Contextual Captioning of Human Movement Trajectories
Asano, Hikaru, Yonetani, Ryo, Sekii, Taiki, Ouchi, Hiroki
This paper presents Text2Traj2Text, a novel learning-by-synthesis framework for captioning possible contexts behind shoppers' trajectory data in retail stores. Our work will impact various retail applications that need better customer understanding, such as targeted advertising and inventory management. The key idea is leveraging large language models to synthesize a diverse and realistic collection of contextual captions as well as the corresponding movement trajectories on a store map. Despite being learned from fully synthesized data, the captioning model can generalize well to trajectories/captions created by real human subjects. Our systematic evaluation confirmed the effectiveness of the proposed framework over competitive approaches in terms of ROUGE and BERTScore metrics.
How the (Tensor-) Brain uses Embeddings and Embodiment to Encode Senses and Decode Symbols
The tensor brain has been introduced as a computational model for perception and memory. We provide an overview of the tensor brain model, including recent developments. The tensor brain has two major layers: the representation layer and the index layer. The representation layer is a model for the subsymbolic global workspace from consciousness research. The state of the representation layer is the cognitive brain state. The index layer contains symbols for concepts, time instances, and predicates. In a bottom-up operation, the cognitive brain state is encoded by the index layer as symbolic labels. In a top-down operation, symbols are decoded and written to the representation layer. This feeds to earlier processing layers as embodiment. The top-down operation became the basis for semantic memory. The embedding vector of a concept forms the connection weights between its index and the representation layer. The embedding is the signature or "DNA" of a concept, which is decoded by the brain when its index is activated. It integrates all that is known about a concept from different experiences, modalities, and symbolic decodings. Although the model is computational, it has been suggested that the tensor brain might be related to the actual operation of the brain. The sequential nature of symbol generation might have been a prerequisite to the generation of natural language. We describe an attention mechanism and discuss multitasking by multiplexing. We emphasize the inherent multimodality of the tensor brain. Finally, we discuss embedded and symbolic reasoning.
Green Federated Learning: A new era of Green Aware AI
Thakur, Dipanwita, Guzzo, Antonella, Fortino, Giancarlo
The development of AI applications, especially in large-scale wireless networks, is growing exponentially, alongside the size and complexity of the architectures used. In particular, machine learning is acknowledged as one of today's most energy-intensive computational applications, posing a significant challenge to the environmental sustainability of next-generation intelligent systems. Achieving environmental sustainability entails ensuring that every AI algorithm is designed with sustainability in mind, integrating green considerations from the architectural phase onwards. Federated Learning (FL), with its distributed nature, has recently presented new opportunities to address this need. Hence, it's imperative to elucidate the potential and challenges stemming from recent FL advancements and their implications for sustainability. Moreover, it's crucial to furnish researchers, stakeholders, and interested parties with a roadmap to navigate and understand existing efforts and gaps in green-aware AI algorithms. This survey primarily aims to achieve this objective by identifying and analyzing over a hundred FL works, assessing their contributions to green-aware artificial intelligence for sustainable environments, with a specific focus on IoT research. It delves into current issues in green federated learning from an energy-efficient standpoint, discussing potential challenges and future prospects for green IoT application research.