Real-Time Pitch/F0 Detection Using Spectrogram Images and Convolutional Neural Networks
–arXiv.org Artificial Intelligence
Pitch (also called F0 or fundamental frequency) is an important voice feature for smart mobility applications such as driver emotion detection, personalized vehicle profiles, and secure speaker identification. This paper presents a novel approach that uses Convolutional Neural Networks (CNNs) and image processing techniques to estimate pitch directly from spectrogram images. Our approach achieves good detection accuracy: 92% of predicted pitch contours have strong or moderate correlations with the true pitch contours. Furthermore, an experimental comparison between our approach and other state-of-the-art CNN methods shows that it improves the detection rate by approximately 5% across various Signal-to-Noise Ratio (SNR) conditions. Pitch detection is widely used in smart mobility features. For example, as shown in Fig. 1, a pitch contour can be used to train a deep neural network for driver emotion detection, which can warn of road rage.
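The accuracy figure above is stated in terms of correlation between predicted and true pitch contours. As a rough illustration of how such a figure could be computed, the sketch below scores each contour pair with the Pearson correlation coefficient and counts the share that falls into a "strong" or "moderate" band. The function names and the band thresholds (0.7 and 0.4) are assumptions made for this example; the abstract does not state which correlation measure or cut-offs the authors used.

```python
import math


def pearson_r(pred, true):
    """Pearson correlation between two equal-length pitch contours (Hz per frame)."""
    if len(pred) != len(true) or len(pred) < 2:
        raise ValueError("contours must have the same length (at least 2 frames)")
    mean_p = sum(pred) / len(pred)
    mean_t = sum(true) / len(true)
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, true))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in true)
    denom = math.sqrt(var_p * var_t)
    # A flat contour has zero variance; treat it as uncorrelated.
    return cov / denom if denom > 0 else 0.0


def correlation_band(r, strong=0.7, moderate=0.4):
    """Label a correlation value. Thresholds are assumed, not taken from the paper.

    The signed value is used, so an inverted contour (negative r) counts as weak.
    """
    if r >= strong:
        return "strong"
    if r >= moderate:
        return "moderate"
    return "weak"


def fraction_strong_or_moderate(pairs):
    """Share of (predicted, true) contour pairs labelled strong or moderate."""
    if not pairs:
        raise ValueError("no contour pairs given")
    hits = sum(
        1 for pred, true in pairs
        if correlation_band(pearson_r(pred, true)) != "weak"
    )
    return hits / len(pairs)
```

A call such as `fraction_strong_or_moderate(pairs)`, with `pairs` holding one `(predicted, true)` tuple per utterance, would give a number comparable in spirit to the reported 92%, under the assumptions stated above.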
Apr-9-2025
- Country:
- North America > United States
- Michigan > Macomb County
- Warren (0.05)
- New York (0.04)
- Genre:
- Overview > Innovation (0.34)
- Research Report > Promising Solution (0.34)
- Industry:
- Automobiles & Trucks (0.47)
- Transportation > Ground
- Road (0.48)
- Technology: