vocal technique
Transfer Learning in Vocal Education: Technical Evaluation of Limited Samples Describing Mezzo-soprano
Hou, Zhenyi, Zhao, Xu, Ye, Kejie, Sheng, Xinyu, Jiang, Shanggerile, Xia, Jiajing, Zhang, Yitao, Ban, Chenxi, Luo, Daijun, Chen, Jiaxing, Zou, Yan, Feng, Yuchao, Fan, Guangyu, Yuan, Xin
Vocal education in the music field is difficult to quantify because singers' voices differ individually and singing techniques are judged by differing quantitative criteria. Deep learning has great potential in music education because it handles complex data efficiently and supports quantitative analysis. However, accurate evaluation of rare voice types such as Mezzo-soprano with deep learning models requires extensive, well-annotated data, while only limited samples are available. To address this, we perform transfer learning, employing deep learning models pre-trained on the ImageNet and UrbanSound8K datasets to improve the precision of vocal technique evaluation. Furthermore, we tackle the lack of samples by constructing a dedicated dataset, the Mezzo-soprano Vocal Set (MVS), for vocal technique assessment. Our experimental results indicate that transfer learning increases the overall accuracy (OAcc) of all models by an average of 8.3%, with the highest accuracy at 94.2%. We not only provide a novel approach to evaluating Mezzo-soprano vocal techniques but also introduce a new quantitative assessment method for music education.
- Asia > China > Shanghai > Shanghai (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
- Oceania > Australia > Queensland > Brisbane (0.04)
- Health & Medicine (1.00)
- Education > Curriculum > Subject-Specific Education (0.86)
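The Mezzo-soprano abstract above reports results as overall accuracy (OAcc) and as an average gain across models. The abstract does not define either quantity, so the following is only a minimal sketch under two assumptions: that OAcc is the fraction of correctly classified samples, and that the average gain is the mean per-model difference in percentage points. The function names, model names, and numbers are hypothetical and are not taken from the paper.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label.

    Assumption: this is what the abstract calls OAcc.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mean_gain_points(baseline, transferred):
    """Mean per-model OAcc gain, in percentage points.

    Both arguments map a model name to its OAcc in [0, 1].
    Assumption: the reported average is an absolute, not relative, gain.
    """
    gains = [100 * (transferred[name] - baseline[name]) for name in baseline]
    return sum(gains) / len(gains)


# Hypothetical labels and scores, for illustration only.
labels = ["good", "poor", "good", "poor"]
preds = ["good", "poor", "poor", "poor"]
acc = overall_accuracy(labels, preds)

scratch = {"model_a": 0.80, "model_b": 0.70}
pretrained = {"model_a": 0.90, "model_b": 0.75}
gain = mean_gain_points(scratch, pretrained)
```

Reading the gain as percentage points rather than as a relative improvement changes the interpretation of a figure such as 8.3%, so the convention is worth stating explicitly whenever such numbers are compared.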
EMVD dataset: a dataset of extreme vocal distortion techniques used in heavy metal
Tailleur, Modan, Pinquier, Julien, Millot, Laurent, Vogel, Corsin, Lagrange, Mathieu
In this paper, we introduce the Extreme Metal Vocals Dataset, a collection of recordings of extreme vocal techniques performed in heavy metal music. The dataset consists of 760 audio excerpts, each 1 to 30 seconds long, totaling about 100 minutes of audio material: roughly 60 minutes of distorted voice and 40 minutes of clear voice recordings. The recordings come from 27 different singers and are provided without accompanying musical instruments or post-processing effects. The distortion taxonomy of the dataset encompasses four distinct distortion techniques and three vocal effects, all performed in different pitch ranges. The performance of a state-of-the-art deep learning model is evaluated on two different classification tasks related to vocal techniques, demonstrating the potential of this resource for the audio processing community.
- Europe > France > Pays de la Loire > Loire-Atlantique > Nantes (0.06)
- Europe > France > Occitanie > Haute-Garonne > Toulouse (0.05)
- North America > United States > Ohio > Hamilton County > Cincinnati (0.04)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
Singing Voice Conversion with Disentangled Representations of Singer and Vocal Technique Using Variational Autoencoders
Luo, Yin-Jyun, Hsu, Chin-Chen, Agres, Kat, Herremans, Dorien
We propose a flexible framework that handles both singer conversion and conversion of a singer's vocal technique. The proposed model is trained on non-parallel corpora, accommodates many-to-many conversion, and leverages recent advances in variational autoencoders. It employs separate encoders to learn disentangled latent representations of singer identity and vocal technique, with a joint decoder for reconstruction. Conversion is carried out by simple vector arithmetic in the learned latent spaces. Both a quantitative analysis and a visualization of the converted spectrograms show that our model is able to disentangle singer identity and vocal technique and successfully convert these attributes. To the best of our knowledge, this is the first work to jointly tackle conversion of singer identity and vocal technique with a deep learning approach.
- North America > United States > California > Los Angeles County > Los Angeles (0.28)
- Asia > Singapore (0.05)
- South America > Colombia > Meta Department > Villavicencio (0.04)
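The singing voice conversion abstract above states that conversion is done by simple vector arithmetic in the learned latent spaces, without giving the exact operation. One common form of such arithmetic, shown here purely as an assumed illustration, shifts a latent vector by the difference between the mean latent vector of the target attribute and that of the source attribute. The helper names and the two-dimensional vectors are hypothetical; a real model would obtain the vectors from its encoders and pass the result to its decoder.

```python
def mean_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def shift_latent(z, source_mean, target_mean):
    """Move z from the source attribute toward the target attribute.

    Computes z - source_mean + target_mean component-wise.
    Assumption: this is one plausible form of latent vector arithmetic,
    not necessarily the operation used in the paper.
    """
    return [zi - s + t for zi, s, t in zip(z, source_mean, target_mean)]


# Hypothetical singer-identity latents for two singers.
singer_a_latents = [[1.0, 2.0], [3.0, 4.0]]
singer_b_latents = [[5.0, 0.0], [7.0, 2.0]]

mean_a = mean_vector(singer_a_latents)
mean_b = mean_vector(singer_b_latents)

# Convert the first utterance of singer A toward singer B.
converted = shift_latent(singer_a_latents[0], mean_a, mean_b)
```

Because the abstract describes separate latent spaces for singer identity and vocal technique, a shift of this kind could be applied to one space while the other is left unchanged, which is what makes the attributes independently convertible.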