AITopics | Tzinis, Efthymios

Tzinis, Efthymios

Latent Iterative Refinement for Modular Source Separation

Bralios, Dimitrios, Tzinis, Efthymios, Wichern, Gordon, Smaragdis, Paris, Le Roux, Jonathan

arXiv.org Artificial Intelligence, Oct-15-2023

Traditional source separation approaches train deep neural network models end-to-end with all the data available at once by minimizing the empirical risk on the whole training set. On the inference side, after training the model, the user fetches a static computation graph and runs the full model on a specified observed mixture signal to get the estimated source signals. Additionally, many of those models consist of several basic processing blocks which are applied sequentially. We argue that we can significantly increase resource efficiency during both training and inference by reformulating a model's training and inference procedures as iterative mappings of latent signal representations. First, we can apply the same processing block more than once on its own output to refine the input signal and consequently improve parameter efficiency. During training, we can follow a block-wise procedure which enables a reduction in memory requirements. Thus, one can train a very complicated network structure using significantly less computation than end-to-end training. During inference, we can use a gating module to dynamically adjust how many processing blocks, and how many iterations of a specific block, an input signal needs.

artificial intelligence, iteration, machine learning, (19 more...)

doi: 10.1109/ICASSP49357.2023.10096897

arXiv: 2211.11917

Country: North America > United States > Illinois > Champaign County > Urbana (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Learning Representations for New Sound Classes With Continual Self-Supervised Learning

Wang, Zhepei, Subakan, Cem, Jiang, Xilin, Wu, Junkai, Tzinis, Efthymios, Ravanelli, Mirco, Smaragdis, Paris

arXiv.org Artificial Intelligence, Dec-13-2022

In this paper, we work on a sound recognition system that continually incorporates new sound classes. Our main goal is to develop a framework where the model can be updated without relying on labeled data. For this purpose, we propose adopting representation learning, where an encoder is trained using unlabeled data. This learning framework enables the study and implementation of a practically relevant use case where only a small number of labels are available in a continual learning context. We also make the empirical observation that a similarity-based representation learning method within this framework is robust to forgetting even if no explicit mechanism against forgetting is employed. We show that this approach performs similarly to several distillation-based continual learning methods when they are applied to self-supervised representation learning methods.

artificial intelligence, machine learning, representation, (16 more...)

doi: 10.1109/LSP.2022.3229643

arXiv: 2205.0739

Country: North America > Canada > Quebec (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.42)

Continual self-training with bootstrapped remixing for speech enhancement

Tzinis, Efthymios, Adi, Yossi, Ithapu, Vamsi K., Xu, Buye, Kumar, Anurag

arXiv.org Artificial Intelligence, Jan-29-2022

We propose RemixIT, a simple and novel self-supervised training method for speech enhancement. The proposed method is based on a continual self-training scheme that overcomes limitations of previous studies, including assumptions about the in-domain noise distribution and the need for access to clean target signals. Specifically, a separation teacher model is pre-trained on an out-of-domain dataset and is used to infer estimated target signals for a batch of in-domain mixtures. Next, we bootstrap the mixing process by generating artificial mixtures using permuted estimated clean and noise signals. Finally, the student model is trained using the permuted estimated sources as targets while we periodically update the teacher's weights using the latest student model. Our experiments show that RemixIT outperforms several previous state-of-the-art self-supervised methods on multiple speech enhancement tasks. Additionally, RemixIT provides a seamless alternative for semi-supervised and unsupervised domain adaptation for speech enhancement tasks, while being general enough to be applied to any separation task and paired with any separation model.
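
The bootstrapped remixing step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `remix_batch`, the array shapes, and the random stand-ins for the teacher's outputs are all assumptions made for the example.

```python
import numpy as np

def remix_batch(est_speech, est_noise, rng):
    """Build bootstrapped mixtures from (hypothetical) teacher estimates.

    est_speech, est_noise: arrays of shape (batch, time) holding the
    teacher's estimated speech and noise for each in-domain mixture.
    """
    # Shuffle the estimated noise signals across the batch dimension.
    perm = rng.permutation(est_noise.shape[0])
    permuted_noise = est_noise[perm]
    # Artificial mixtures that would serve as the student's inputs.
    new_mixtures = est_speech + permuted_noise
    # The student would be trained to recover these two targets.
    return new_mixtures, est_speech, permuted_noise

# Random arrays stand in for teacher estimates (assumption for the demo).
rng = np.random.default_rng(0)
est_speech = rng.standard_normal((4, 16000))
est_noise = rng.standard_normal((4, 16000))
mix, target_speech, target_noise = remix_batch(est_speech, est_noise, rng)
```

In a full training loop, the teacher would presumably be refreshed from the student every few epochs, as the abstract describes; that part is omitted here.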

artificial intelligence, machine learning, speech enhancement, (16 more...)

doi: 10.1109/ICASSP43922.2022.9747463

arXiv: 2110.10103

Genre: Research Report (1.00)

Industry: Education (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Sudo rm -rf: Efficient Networks for Universal Audio Source Separation

Tzinis, Efthymios, Wang, Zhepei, Smaragdis, Paris

arXiv.org Machine Learning, Jul-14-2020

In this paper, we present an efficient neural network for end-to-end general purpose audio source separation. Specifically, the backbone structure of this convolutional network is the SUccessive DOwnsampling and Resampling of Multi-Resolution Features (SuDoRMRF), together with their aggregation, which is performed through simple one-dimensional convolutions. In this way, we are able to obtain high quality audio source separation with a limited number of floating point operations, low memory requirements, few parameters, and low latency. Our experiments on both speech and environmental sound separation datasets show that SuDoRMRF performs comparably to, and even surpasses, various state-of-the-art approaches that have significantly higher computational resource requirements.

deep learning, neural network, separation, (19 more...)

arXiv: 2007.06833

Genre: Research Report > Promising Solution (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Hardware (0.89)

Two-Step Sound Source Separation: Training on Learned Latent Targets

Tzinis, Efthymios, Venkataramani, Shrikant, Wang, Zhepei, Subakan, Cem, Smaragdis, Paris

arXiv.org Machine Learning, Oct-23-2019

In this paper, we propose a two-step training procedure for source separation via a deep neural network. In the first step, we learn a transform (and its inverse) to a latent space where masking-based separation performance using oracles is optimal. In the second step, we train a separation module that operates on the previously learned space. To do so, we also make use of a scale-invariant signal-to-distortion ratio (SI-SDR) loss function that works in the latent space, and we prove that it lower-bounds the SI-SDR in the time domain. We run various sound separation experiments that show how this approach can obtain better performance than systems that learn the transform and the separation module jointly. The proposed methodology is general enough to be applicable to a large class of neural network end-to-end separation systems.
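
For readers unfamiliar with the metric, the sketch below computes SI-SDR using the commonly used time-domain definition (project the estimate onto the reference, then compare the energy of the projection with the energy of the residual). It is not the latent-space loss the paper proposes; the function name `si_sdr` and the small stabilizing constant `eps` are assumptions made for this example.

```python
import numpy as np

def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB for two 1-D time-domain signals."""
    # Optimal scaling of the reference that best explains the estimate.
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference
    # Whatever the scaled reference does not explain counts as distortion.
    residual = estimate - target
    ratio = (np.dot(target, target) + eps) / (np.dot(residual, residual) + eps)
    return 10.0 * np.log10(ratio)

# Synthetic demo signals (assumption): a reference plus light or heavy noise.
rng = np.random.default_rng(0)
reference = rng.standard_normal(8000)
noise = rng.standard_normal(8000)
good_estimate = reference + 0.1 * noise
poor_estimate = reference + 1.0 * noise
```

Rescaling the estimate leaves the value essentially unchanged, which is the "scale-invariant" property: `si_sdr(3 * good_estimate, reference)` should agree with `si_sdr(good_estimate, reference)` up to numerical precision.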

deep learning, neural network, separation, (19 more...)

arXiv: 1910.09804

Country: North America (0.28)

Genre:

Research Report (0.50)
Workflow (0.49)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Continual Learning of New Sound Classes using Generative Replay

Wang, Zhepei, Subakan, Cem, Tzinis, Efthymios, Smaragdis, Paris, Charlin, Laurent

arXiv.org Machine Learning, Jun-3-2019

Continual learning consists of incrementally training a model on a sequence of datasets and testing on the union of all datasets. In this paper, we examine continual learning for the problem of sound classification, in which we wish to refine already trained models to learn new sound classes. In practice, one does not want to maintain all past training data and retrain from scratch, but naively updating a model with new data(sets) degrades performance on already learned tasks, which is referred to as "catastrophic forgetting." We develop a generative replay procedure for generating training audio spectrogram data, in place of keeping older training datasets. We show that, by incrementally refining a classifier with generative replay, a generator that is 4% of the size of all previous training data matches the performance of refining the classifier while keeping 20% of all previous training data. We thus conclude that we can extend a trained sound classifier to learn new classes without having to keep previously used datasets.

artificial intelligence, continual learning, neural network, (18 more...)

arXiv: 1906.00654

Country: North America > Canada > Quebec (0.14)

Genre: Research Report (0.65)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.99)

Bootstrapped Coordinate Search for Multidimensional Scaling

Tzinis, Efthymios

arXiv.org Machine Learning, Feb-4-2019

In this work, a unified framework for gradient-free Multidimensional Scaling (MDS) based on Coordinate Search (CS) is proposed. This family of algorithms is an instance of General Pattern Search (GPS) methods, which avoid the explicit computation of derivatives and instead evaluate the objective function while searching along coordinate steps of the embedding space. The backbone of the CSMDS framework is a probability matrix that specifies how likely each coordinate is to be evaluated. We propose a Bootstrapped instance of CSMDS (BS CSMDS) which increases the probability of the direction that decreases the objective function the most while reducing the probability of all the other coordinates. BS CSMDS avoids unnecessary function evaluations and yields a significant speedup over other CSMDS alternatives while obtaining the same error rate. Experiments on both synthetic and real data reveal that BS CSMDS performs consistently better than other CSMDS alternatives under various experimental setups.

artificial intelligence, csmd, optimization problem, (19 more...)

arXiv: 1902.01482

Country: North America > United States > Illinois > Champaign County (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Unsupervised Deep Clustering for Source Separation: Direct Learning from Mixtures using Spatial Information

Tzinis, Efthymios, Venkataramani, Shrikant, Smaragdis, Paris

arXiv.org Machine Learning, Nov-9-2018

Affiliations: University of Illinois at Urbana-Champaign, Department of Computer Science (Tzinis, Venkataramani, Smaragdis); Adobe Research (Smaragdis)

We present a monophonic source separation system that is trained by only observing mixtures with no ground truth separation information. We use a deep clustering approach which trains on multi-channel mixtures and learns to project spectrogram bins to source clusters that correlate with various spatial features. We show that using such a training process we can obtain separation performance that is as good as making use of ground truth separation information. Once trained, this system is capable of performing sound separation on monophonic inputs, despite having learned how to do so using multi-channel recordings.

Index Terms: deep clustering, source separation, unsupervised learning

Introduction (excerpt): A central problem when designing source separation systems is that of defining what constitutes a source.

deep learning, neural network, separation, (19 more...)

arXiv: 1811.01531

Country: North America > United States > Illinois (0.24)

Genre: Research Report (0.82)

Industry: Media > News (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)
Information Technology > Artificial Intelligence > Representation & Reasoning > Spatial Reasoning (0.92)

Integrating Recurrence Dynamics for Speech Emotion Recognition

Tzinis, Efthymios, Paraskevopoulos, Georgios, Baziotis, Christos, Potamianos, Alexandros

arXiv.org Machine Learning, Nov-9-2018

We investigate the performance of features that can capture nonlinear recurrence dynamics embedded in the speech signal for the task of Speech Emotion Recognition (SER). Reconstructing the phase space of each speech frame and computing its respective Recurrence Plot (RP) reveals complex structures which can be measured by performing Recurrence Quantification Analysis (RQA). These measures are aggregated by using statistical functionals over segment and utterance periods. We report SER results for the proposed feature set on three databases using different classification methods. When fusing the proposed features with traditional feature sets, e.g., [1], we show an improvement in unweighted accuracy of up to 5.7% and 10.7% on Speaker-Dependent (SD) and Speaker-Independent (SI) SER tasks, respectively, over the baseline [1]. Following a segment-based approach, we demonstrate state-of-the-art performance on IEMOCAP using a Bidirectional Recurrent Neural Network.
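
As a rough illustration of the pipeline named in this abstract (phase-space reconstruction, recurrence plot, one RQA measure), the sketch below uses a time-delay embedding and a fixed distance threshold. The embedding dimension, delay, threshold, Euclidean norm, and all function names are assumptions chosen for the example, not settings taken from the paper.

```python
import numpy as np

def delay_embed(x, dim=3, tau=2):
    """Time-delay embedding: row t is (x[t], x[t+tau], ..., x[t+(dim-1)*tau])."""
    n = len(x) - (dim - 1) * tau
    return np.stack([x[i * tau : i * tau + n] for i in range(dim)], axis=1)

def recurrence_plot(x, dim=3, tau=2, threshold=0.2):
    """Binary matrix marking pairs of embedded states closer than `threshold`."""
    states = delay_embed(np.asarray(x, dtype=float), dim, tau)
    # Pairwise Euclidean distances between all reconstructed states.
    dists = np.linalg.norm(states[:, None, :] - states[None, :, :], axis=-1)
    return (dists <= threshold).astype(int)

def recurrence_rate(rp):
    """Fraction of recurrent points, one of the simplest RQA measures."""
    return rp.mean()

# A synthetic sinusoid stands in for a speech frame (assumption for the demo).
frame = np.sin(2 * np.pi * 5 * np.linspace(0.0, 1.0, 200))
rp = recurrence_plot(frame)
rate = recurrence_rate(rp)
```

Per-frame measures such as `rate` would then be summarized with statistical functionals (mean, variance, and so on) over segments or utterances, as the abstract describes.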

deep learning, emotion recognition, neural network, (21 more...)

doi: 10.21437/Interspeech.2018-1377

arXiv: 1811.04133

Country:

North America > United States (0.14)
Europe > Greece (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Pattern Search Multidimensional Scaling

Paraskevopoulos, Georgios, Tzinis, Efthymios, Vlatakis-Gkaragkounis, Emmanuel-Vasileios, Potamianos, Alexandros

arXiv.org Machine Learning, Jun-6-2018

We present a novel view of nonlinear manifold learning using derivative-free optimization techniques. Specifically, we propose an extension of the classical multi-dimensional scaling (MDS) method, where instead of performing gradient descent, we sample and evaluate possible "moves" in a sphere of fixed radius for each point in the embedded space. A fixed-point convergence guarantee can be shown by formulating the proposed algorithm as an instance of the General Pattern Search (GPS) framework. Evaluation on both clean and noisy synthetic datasets shows that pattern search MDS can accurately infer the intrinsic geometry of manifolds embedded in high-dimensional spaces. Additionally, experiments on real data, even under noisy conditions, demonstrate that the proposed pattern search MDS yields state-of-the-art results.
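
The derivative-free idea can be sketched as below: each point tries fixed-radius moves, a move is kept only if it lowers the MDS stress, and the radius is halved when no move helps. This is a simplified illustration under stated assumptions, not the authors' algorithm: it searches only along the coordinate axes, accepts the first improving move, and the names `stress` and `pattern_search_mds` as well as all default parameters are hypothetical.

```python
import numpy as np

def stress(X, D):
    """Raw MDS stress: squared mismatch between embedded and target distances."""
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(-1))
    return ((d - D) ** 2).sum() / 2.0

def pattern_search_mds(D, dim=2, radius=1.0, min_radius=1e-4, max_epochs=200, seed=0):
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    X = rng.standard_normal((n, dim))  # random initial embedding
    # Candidate moves: plus or minus one step along each coordinate axis.
    directions = np.vstack([np.eye(dim), -np.eye(dim)])
    best = stress(X, D)
    for _ in range(max_epochs):
        improved = False
        for i in range(n):
            for step in directions:
                trial = X.copy()
                trial[i] += radius * step
                value = stress(trial, D)
                if value < best:  # keep the move only if stress decreases
                    X, best, improved = trial, value, True
        if not improved:
            radius /= 2.0  # no move helped at this scale: shrink the radius
            if radius < min_radius:
                break
    return X, best

# Demo on distances from random 2-D points (assumption for the example).
rng = np.random.default_rng(1)
points = rng.standard_normal((10, 2))
D = np.sqrt(((points[:, None] - points[None, :]) ** 2).sum(-1))
X, final_stress = pattern_search_mds(D)
```

No gradient is ever computed: the search relies only on evaluations of `stress`, which is the property the abstract highlights.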

algorithm, neural network, optimization problem, (20 more...)

arXiv: 1806.00416

Country: North America > United States > Pennsylvania (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.48)
