Rabiee, Azam
Question-type Identification for Academic Questions in Online Learning Platform
Rabiee, Azam, Goel, Alok, D'Souza, Johnson, Khanwalkar, Saurabh
Online learning platforms provide learning materials and answers to students' academic questions from experts, peers, or systems. This paper explores question-type identification as a step in content understanding for an online learning platform. The aim of the question-type identifier is to categorize questions by their structure and complexity, using the question text, subject, and structural features. We have defined twelve question-type classes, including Multiple-Choice Question (MCQ), essay, and others. We have compiled an internal dataset of students' questions and labeled it using a combination of weak-supervision techniques and manual annotation. We then trained a BERT-based ensemble model on this dataset and evaluated it on a separate human-labeled test set. Our experiments yielded an F1-score of 0.94 for MCQ binary classification and promising results for 12-class multilabel classification. We deployed the model in our online learning platform as a crucial enabler for content understanding to enhance the student learning experience.
D-Point Trigonometric Path Planning based on Q-Learning in Uncertain Environments
Jeihaninejad, Ehsan, Rabiee, Azam
Finding the optimum path for a robot to move from a start position to a goal position through obstacles is still a challenging issue. This paper presents a novel path planning method, named D-point trigonometric, based on the Q-learning algorithm for dynamic and uncertain environments in which all the obstacles and the target are moving. We define new state, action, and reward functions for Q-learning by which the agent can find the best action in every state to reach the goal along the most appropriate path. Moreover, experiments in Unity3D confirmed the high convergence speed, the high hit rate, and the low dependency on environmental parameters of the proposed method compared with a competing approach. Path planning has been considered a challenging concern in video games [1], transportation systems [2], and mobile robots [3] [4]. The most important path planning issues include the dynamics and uncertainty of the environment, the smoothness and length of the path, obstacle avoidance, and the computational cost. In the last few decades, researchers have made numerous efforts to present new approaches to solve them [5] [6] [7] [8]. Generally, most path planning approaches fall into one of the following categories [9] [10] [11]: (1) classical methods, comprising (a) computational geometry (CG), (b) probabilistic roadmap (PRM), and (c) potential fields method (PFM); and (2) heuristic and metaheuristic methods, comprising (a) soft computing and (b) hybrid algorithms. Since the complexity and the execution time of CG methods were high [11], PRMs were proposed to reduce the search space using techniques like milestones [12].
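For readers unfamiliar with the underlying algorithm, the following is a minimal sketch of generic tabular Q-learning on a toy one-dimensional corridor. It does not reproduce the D-point trigonometric state, action, or reward definitions, which the abstract does not spell out; the corridor environment, the reward values, and the names `q_update`, `epsilon_greedy`, and `train_corridor` are all illustrative assumptions.

```python
import random
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def train_corridor(length=6, episodes=300, seed=0):
    """Toy environment (an assumption): cells 0..length-1, goal at the right end."""
    rng = random.Random(seed)
    actions = (-1, 1)            # step left or step right
    q = defaultdict(float)       # unseen (state, action) pairs start at 0.0
    goal = length - 1
    for _ in range(episodes):
        state = 0
        for _ in range(100):     # cap the episode length
            action = epsilon_greedy(q, state, actions, 0.2, rng)
            next_state = min(max(state + action, 0), goal)
            # Hypothetical reward: +1 at the goal, small penalty per step.
            reward = 1.0 if next_state == goal else -0.01
            q_update(q, state, action, reward, next_state, actions)
            state = next_state
            if state == goal:
                break
    return q


if __name__ == "__main__":
    table = train_corridor()
    for s in range(5):
        print(s, round(table[(s, -1)], 3), round(table[(s, 1)], 3))
```

In a moving-obstacle setting such as the one the paper targets, the state would have to encode the relative positions of obstacles and target rather than a fixed cell index; the update rule itself stays the same.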
On the Efficiency of the Neuro-Fuzzy Classifier for User Knowledge Modeling Systems
Jeihaninejad, Ehsan, Rabiee, Azam
User knowledge modeling systems are used as the most effective technology for capturing new users' attention. Moreover, these intelligent services increase the quality of service (QoS). This paper proposes two user knowledge classifiers based on artificial neural networks, which serve as one of the influential parts of knowledge modeling systems. We employed a multi-layer perceptron (MLP) and an adaptive neural fuzzy inference system (ANFIS) as the classifiers. As the classifiers' inputs, we used real data containing the users' degree of study time, number of repetitions, exam performance, and learning percentage. Compared with well-known methods such as KNN and Bayesian classifiers used in other research on the same data sets, our experiments show better performance. Although the number of samples in the training set is not large, the performance of the neuro-fuzzy classifier on the test set is 98.6%, which is the best result among the compared methods. The MLP performs worse than ANFIS, although it still outperforms other methods such as Bayesian and KNN classifiers. As our goal is to evaluate and report the efficiency of a neuro-fuzzy classifier for user knowledge modeling systems, we used several evaluation metrics, including the Receiver Operating Characteristic and the Area Under its Curve, Total Accuracy, and the Kappa statistic.
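Two of the metrics named above, Total Accuracy and the Kappa statistic, can be computed directly from a confusion matrix. The sketch below assumes the standard Cohen's kappa definition, kappa = (p_o - p_e) / (1 - p_e); the function names and the example matrix are illustrative and are not taken from the paper's data.

```python
def total_accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def cohen_kappa(confusion):
    """Cohen's kappa: agreement corrected for chance agreement.

    Rows are true classes and columns are predicted classes.
    Undefined (division by zero) when chance agreement equals 1.
    """
    total = sum(sum(row) for row in confusion)
    observed = total_accuracy(confusion)
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(col) for col in zip(*confusion)]
    # Chance agreement: sum over classes of P(true = k) * P(predicted = k).
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Made-up two-class example, not results from the paper.
    matrix = [[20, 5],
              [10, 15]]
    print(total_accuracy(matrix), cohen_kappa(matrix))
```

Kappa is useful alongside accuracy when classes are imbalanced, because a classifier that always predicts the majority class can score high accuracy while its kappa stays near zero.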
A Fully Time-domain Neural Model for Subband-based Speech Synthesizer
Rabiee, Azam, Kim, Geonmin, Kim, Tae-Ho, Lee, Soo-Young
This paper introduces a deep neural network model for a subband-based speech synthesizer. The model benefits from the narrow bandwidth of the subband signals to reduce the complexity of the time-domain speech generator. We employed multi-level wavelet analysis and synthesis to decompose the signal into subbands and reconstruct it in the time domain. Inspired by WaveNet, a convolutional neural network (CNN) model predicts the subband speech signals fully in the time domain. Because of the narrow bandwidth of the subbands, a simple network architecture is enough to learn their simple patterns accurately. In ground-truth experiments with teacher forcing, the subband synthesizer significantly outperforms the fullband model in terms of both subjective and objective measures. In addition, by conditioning the model on the phoneme sequence using a pronunciation dictionary, we have achieved a fully time-domain neural model for a subband-based text-to-speech (TTS) synthesizer, which is nearly end-to-end. The generated speech of the subband TTS shows quality comparable to that of the fullband one, with a lighter network architecture for each subband.
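The multi-level analysis/synthesis step can be illustrated with a small sketch. The abstract does not name the wavelet family, so the Haar wavelet is used here purely as an assumption because it is the simplest case; the function names are hypothetical, and the input length must be divisible by 2 raised to the number of levels.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor


def haar_analysis(signal):
    """Split a signal into half-rate approximation (low) and detail (high) bands."""
    approx = [(signal[i] + signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * _S for i in range(0, len(signal), 2)]
    return approx, detail


def haar_synthesis(approx, detail):
    """Invert haar_analysis: interleave the reconstructed sample pairs."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * _S)
        out.append((a - d) * _S)
    return out


def decompose(signal, levels):
    """Return [detail_1 (finest), ..., detail_L, approx_L]."""
    subbands = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_analysis(approx)
        subbands.append(detail)
    subbands.append(approx)
    return subbands


def reconstruct(subbands):
    """Rebuild the time-domain signal from the subbands produced by decompose."""
    approx = subbands[-1]
    for detail in reversed(subbands[:-1]):
        approx = haar_synthesis(approx, detail)
    return approx


if __name__ == "__main__":
    x = [0.0, 1.0, 0.5, -0.5, -1.0, 0.25, 0.75, 0.0]
    bands = decompose(x, levels=2)
    print([len(b) for b in bands])
    print(reconstruct(bands))
```

Each subband is shorter than the original signal and covers a narrower frequency range, which is the property the abstract relies on to justify a simpler per-subband generator.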
Adjusting Pleasure-Arousal-Dominance for Continuous Emotional Text-to-speech Synthesizer
Rabiee, Azam, Kim, Tae-Ho, Lee, Soo-Young
Emotion is not limited to discrete categories such as happiness, sadness, anger, fear, disgust, and surprise. Instead, each emotion category can be projected onto a set of nearly independent dimensions, named pleasure (or valence), arousal, and dominance, known as PAD. The value of each dimension varies from -1 to 1, such that the neutral emotion is in the center with all-zero values. Training a continuous emotional text-to-speech (TTS) synthesizer on these independent dimensions makes it possible to synthesize emotional speech with unlimited emotion categories. Our end-to-end neural speech synthesizer is based on the well-known Tacotron. Empirically, we have found the optimum network architecture for injecting the 3D PADs. Moreover, the PAD values are adjusted for the speech synthesis purpose.