Goto

Collaborating Authors

 P., Prathosh A.


CoNOAir: A Neural Operator for Forecasting Carbon Monoxide Evolution in Cities

arXiv.org Artificial Intelligence

Carbon Monoxide (CO) is a dominant pollutant in urban areas due to the energy generation from fossil fuels for industry, automobile, and domestic requirements. Forecasting the evolution of CO in real-time can enable the deployment of effective early warning systems and intervention strategies. However, the computational cost associated with the physics and chemistry-based simulation makes it prohibitive to implement such a model at the city and country scale. To address this challenge, here, we present a machine learning model based on neural operator, namely, Complex Neural Operator for Air Quality (CoNOAir), that can effectively forecast CO concentrations. We demonstrate this by developing a country-level model for short-term (hourly) and long-term (72-hour) forecasts of CO concentrations. Our model outperforms state-of-the-art models such as Fourier neural operators (FNO) and provides reliable predictions for both short and long-term forecasts. We further analyse the capability of the model to capture extreme events and generate forecasts in urban cities in India. Interestingly, we observe that the model predicts the next hour CO concentrations with R2 values greater than 0.95 for all the cities considered. The deployment of such a model can greatly assist the governing bodies to provide early warning, plan intervention strategies, and develop effective strategies by considering several what-if scenarios. Altogether, the present approach could provide a fillip to real-time predictions of CO pollution in urban cities.


Adversarial Approximate Inference for Speech to Electroglottograph Conversion

arXiv.org Machine Learning

Speech produced by human vocal apparatus conveys substantial non-semantic information including the gender of the speaker, voice quality, affective state, abnormalities in the vocal apparatus etc. Such information is attributed to the properties of the voice source signal, which is usually estimated from the speech signal. However, most of the source estimation techniques depend heavily on the goodness of the model assumptions and are prone to noise. A popular alternative is to indirectly obtain the source information through the Electroglottographic (EGG) signal that measures the electrical admittance around the vocal folds using a dedicated hardware. In this paper, we address the problem of estimating the EGG signal directly from the speech signal, devoid of any hardware. Sampling from the intractable conditional distribution of the EGG signal given the speech signal is accomplished through optimization of an evidence lower bound. This is constructed via minimization of the KL-divergence between the true and the approximated posteriors of a latent variable learned using a deep neural auto-encoder that serves an informative prior which reconstructs the EGG signal. We demonstrate the efficacy of the method to generate EGG signal by conducting several experiments on datasets comprising multiple speakers, voice qualities, noise settings and speech pathologies. The proposed method is evaluated on many benchmark metrics and is found to agree with the gold standards while being better than the state-of-the-art algorithms on a few tasks such as epoch extraction.