Rizos, Georgios
LeanTTA: A Backpropagation-Free and Stateless Approach to Quantized Test-Time Adaptation on Edge Devices
Dong, Cynthia, Jia, Hong, Kwon, Young D., Rizos, Georgios, Mascolo, Cecilia
While there are many advantages to deploying machine learning models on edge devices, the resource constraints of mobile platforms, the dynamic nature of the environment, and differences between the distribution of training versus in-the-wild data make such deployments challenging. Current test-time adaptation methods are often memory-intensive and not designed to be quantization-compatible or deployed on low-resource devices. To address these challenges, we present LeanTTA, a novel backpropagation-free and stateless framework for quantized test-time adaptation tailored to edge devices. Our approach minimizes computational costs by dynamically updating normalization statistics without backpropagation, which frees LeanTTA from the common pitfall of relying on large batches and historical data, making our method robust to realistic deployment scenarios. Our approach is the first to enable further computational gains by combining partial adaptation with quantized module fusion. We validate our framework across sensor modalities, demonstrating significant improvements over state-of-the-art TTA methods, including a 15.7% error reduction, peak memory usage of only 11.2MB for ResNet18, and fast adaptation within an order of magnitude of normal inference speed on-device. LeanTTA provides a robust solution for achieving the right trade-offs between accuracy and system efficiency in edge deployments, addressing the unique challenges posed by limited data and varied operational conditions.
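As a rough illustration of backpropagation-free, stateless adaptation of normalization statistics (not the exact LeanTTA algorithm, and omitting its quantization and partial-adaptation components), the PyTorch sketch below blends each BatchNorm layer's stored statistics with the current sample's statistics for a single forward pass and then restores the originals; the blending weight `alpha` and the per-layer update rule are illustrative assumptions.

```python
# Minimal sketch: gradient-free, stateless adaptation of BatchNorm statistics
# for a single test sample. Nothing persists across samples.
import torch
import torch.nn as nn


@torch.no_grad()
def adapt_and_infer(model: nn.Module, x: torch.Tensor, alpha: float = 0.3) -> torch.Tensor:
    """Blend stored BN statistics with the incoming sample's statistics,
    run inference, then restore the originals."""
    saved = {name: (m.running_mean.clone(), m.running_var.clone())
             for name, m in model.named_modules() if isinstance(m, nn.BatchNorm2d)}

    def make_hook(bn: nn.BatchNorm2d):
        def hook(module, inputs):
            feat = inputs[0]
            mean = feat.mean(dim=(0, 2, 3))
            var = feat.var(dim=(0, 2, 3), unbiased=False)
            # Mix running statistics with the current sample's statistics.
            bn.running_mean.mul_(1 - alpha).add_(alpha * mean)
            bn.running_var.mul_(1 - alpha).add_(alpha * var)
        return hook

    handles = [m.register_forward_pre_hook(make_hook(m))
               for m in model.modules() if isinstance(m, nn.BatchNorm2d)]

    model.eval()
    out = model(x)  # BN layers use the freshly blended statistics

    for h in handles:
        h.remove()
    for name, m in model.named_modules():
        if isinstance(m, nn.BatchNorm2d):
            m.running_mean.copy_(saved[name][0])
            m.running_var.copy_(saved[name][1])
    return out
```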
Towards Open Respiratory Acoustic Foundation Models: Pretraining and Benchmarking
Zhang, Yuwei, Xia, Tong, Han, Jing, Wu, Yu, Rizos, Georgios, Liu, Yang, Mosuily, Mohammed, Chauhan, Jagmohan, Mascolo, Cecilia
Respiratory audio, such as coughing and breathing sounds, has predictive power for a wide range of healthcare applications, yet is currently under-explored. The main problem for those applications arises from the difficulty of collecting large labeled task-specific data for model development. Generalizable respiratory acoustic foundation models pretrained with unlabeled data would offer appealing advantages and possibly break this impasse. However, given the safety-critical nature of healthcare applications, it is pivotal to also ensure openness and replicability for any proposed foundation model solution. To this end, we introduce OPERA, an OPEn Respiratory Acoustic foundation model pretraining and benchmarking system, as the first approach answering this need. We curate large-scale respiratory audio datasets (136K samples, 440 hours), pretrain three pioneering foundation models, and build a benchmark consisting of 19 downstream respiratory health tasks for evaluation. Our pretrained models demonstrate superior performance (against existing acoustic models pretrained with general audio on 16 out of 19 tasks) and generalizability (to unseen datasets and new respiratory audio modalities). This highlights the great promise of respiratory acoustic foundation models and encourages more studies using OPERA as an open resource to accelerate research on respiratory audio for health. The system is accessible from https://github.
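As a hedged illustration of the kind of downstream evaluation such a benchmark involves (not OPERA's actual code or API), the sketch below performs a linear probe: a frozen, hypothetical pretrained encoder embeds each audio clip, and a lightweight classifier is fit on a binary respiratory health task. The `encoder.embed` interface and the data variables are placeholder assumptions.

```python
# Minimal linear-probe sketch for benchmarking a pretrained acoustic encoder.
# `encoder`, its .embed() method, and the audio/label arrays are hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score


def linear_probe(encoder, train_audio, train_labels, test_audio, test_labels):
    """Freeze the encoder, embed each clip, and fit a linear classifier."""
    X_train = np.stack([encoder.embed(a) for a in train_audio])  # (N, D) embeddings
    X_test = np.stack([encoder.embed(a) for a in test_audio])

    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_train, train_labels)
    scores = clf.predict_proba(X_test)[:, 1]
    return roc_auc_score(test_labels, scores)
```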
An Improved StarGAN for Emotional Voice Conversion: Enhancing Voice Quality and Data Augmentation
He, Xiangheng, Chen, Junjie, Rizos, Georgios, Schuller, Björn W.
Emotional Voice Conversion (EVC) aims to convert the emotional style of a source speech signal to a target style while preserving its content and speaker identity information. Previous emotional conversion studies do not disentangle emotional information from the emotion-independent information that should be preserved, thus transforming it all in a monolithic manner and generating audio of low quality with linguistic distortions. To address this distortion problem, we propose a novel StarGAN framework, along with a two-stage training process, that separates emotional features from those independent of emotion by using an autoencoder with two encoders as the generator of the Generative Adversarial Network (GAN). The proposed model achieves favourable results in both the objective and the subjective evaluation of distortion, revealing that it can effectively reduce distortion. Furthermore, in data augmentation experiments for end-to-end speech emotion recognition, the proposed StarGAN model achieves an increase of 2% in Micro-F1 and 5% in Macro-F1 compared to the baseline StarGAN model, which indicates that the proposed model is more valuable for data augmentation.
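The following PyTorch sketch illustrates, in a simplified 1D-convolutional toy form, the idea of a generator built from two encoders, one for emotion-independent content and one for emotional style, feeding a shared decoder. The layer sizes and the concatenation-based fusion are assumptions for illustration, not the paper's exact architecture.

```python
# Toy two-encoder generator: disentangle emotion from emotion-independent
# content instead of encoding everything monolithically.
import torch
import torch.nn as nn


class TwoEncoderGenerator(nn.Module):
    def __init__(self, n_mels: int = 80, content_dim: int = 64, emotion_dim: int = 16):
        super().__init__()
        # Encoder for emotion-independent content (linguistic/speaker information).
        self.content_enc = nn.Sequential(
            nn.Conv1d(n_mels, 128, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(128, content_dim, kernel_size=5, padding=2),
        )
        # Encoder for emotional style.
        self.emotion_enc = nn.Sequential(
            nn.Conv1d(n_mels, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(64, emotion_dim, kernel_size=5, padding=2),
        )
        # Decoder reconstructs a mel-spectrogram from both codes.
        self.decoder = nn.Sequential(
            nn.Conv1d(content_dim + emotion_dim, 128, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(128, n_mels, kernel_size=5, padding=2),
        )

    def forward(self, source_mel: torch.Tensor, target_mel: torch.Tensor) -> torch.Tensor:
        content = self.content_enc(source_mel)    # content from the source utterance
        emotion = self.emotion_enc(target_mel)    # emotional style from a target-domain utterance
        return self.decoder(torch.cat([content, emotion], dim=1))


# Example: convert an 80-bin, 128-frame source mel toward a target emotion style.
gen = TwoEncoderGenerator()
converted = gen(torch.randn(1, 80, 128), torch.randn(1, 80, 128))
```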
Poisson CNN: Convolutional Neural Networks for the Solution of the Poisson Equation with Varying Meshes and Dirichlet Boundary Conditions
Özbay, Ali Girayhan, Laizet, Sylvain, Tzirakis, Panagiotis, Rizos, Georgios, Schuller, Björn
The Poisson equation is commonly encountered in engineering, including in computational fluid dynamics, where it is needed to compute corrections to the pressure field. We propose a novel fully convolutional neural network (CNN) architecture to infer the solution of the Poisson equation on a 2D Cartesian grid of varying size and spacing, given the right-hand-side term, arbitrary Dirichlet boundary conditions, and grid parameters, which provides unprecedented versatility for this application. The boundary conditions are handled using a novel approach that decomposes the original Poisson problem into a homogeneous Poisson problem plus four inhomogeneous Laplace sub-problems. The model is trained using a novel loss function approximating the continuous $L^p$ norm between the prediction and the target. Analytical test cases indicate that our CNN architecture is capable of predicting the correct solution of a Poisson problem with mean percentage errors of 15% and promises improvements in wall-clock runtimes for large problems. Furthermore, even when predicting on meshes denser than previously encountered, our model demonstrates encouraging capacity to reproduce the correct solution profile.
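As a small, hedged illustration of a loss that approximates a continuous $L^p$ norm of the error over a uniform 2D grid (the choice $p = 2$ and the trapezoidal quadrature here are assumptions, not necessarily the paper's exact formulation):

```python
# Sketch of a continuous-L^p-style loss on a uniform 2D Cartesian grid,
# approximating the integral of |pred - target|^p with the trapezoidal rule.
import torch


def lp_loss(pred: torch.Tensor, target: torch.Tensor,
            dx: float, dy: float, p: float = 2.0) -> torch.Tensor:
    """Approximate ( integral |pred - target|^p dx dy )^(1/p).

    pred, target: tensors of shape (batch, ny, nx); dx, dy: grid spacings.
    """
    err = (pred - target).abs().pow(p)
    # Nested trapezoidal rule: integrate along x, then along y.
    inner = torch.trapz(err, dx=dx, dim=-1)
    integral = torch.trapz(inner, dx=dy, dim=-1)
    return integral.pow(1.0 / p).mean()


# Example on a 64x64 grid with spacing h = 1/63 in each direction.
u_pred, u_true = torch.rand(4, 64, 64), torch.rand(4, 64, 64)
print(lp_loss(u_pred, u_true, dx=1 / 63, dy=1 / 63))
```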