window approach
Human Feedback Driven Dynamic Speech Emotion Recognition
Fedorov, Ilya, Korobchenko, Dmitry
This work proposes to explore a new area of dynamic speech emotion recognition. Unlike traditional methods, we assume that each audio track is associated with a sequence of emotions active at different moments in time. The study particularly focuses on the animation of emotional 3D avatars. We propose a multi-stage method that includes the training of a classical speech emotion recognition model, synthetic generation of emotional sequences, and further model improvement based on human feedback. Additionally, we introduce a novel approach to modeling emotional mixtures based on the Dirichlet distribution. The models are evaluated based on ground-truth emotions extracted from a dataset of 3D facial animations. We compare our models against the sliding window approach. Our experimental results show the effectiveness of the Dirichlet-based approach in modeling emotional mixtures. Incorporating human feedback further improves the model quality while providing a simplified annotation procedure.
- North America > United States (0.04)
- Europe > Switzerland (0.04)
- Europe > Italy > Tuscany > Florence (0.04)
- North America > United States > Maryland > Prince George's County > College Park (0.14)
- North America > United States > California > Santa Clara County > Santa Clara (0.04)
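As a rough illustration of the Dirichlet-based emotional mixtures mentioned in the abstract above, the sketch below draws mixture weights over a few emotion classes. The emotion names, the concentration values, and the helper names `sample_dirichlet` and `dirichlet_mean` are assumptions made for illustration; the abstract does not describe the authors' actual model.

```python
import random

# Hypothetical emotion classes; the abstract does not list the real ones.
EMOTIONS = ["joy", "anger", "sadness"]


def sample_dirichlet(alphas, rng):
    """Draw one point on the probability simplex from Dirichlet(alphas).

    Uses the standard construction: independent Gamma(alpha_i, 1) draws,
    normalized by their sum.
    """
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def dirichlet_mean(alphas):
    """Expected mixture weights: alpha_i divided by the sum of all alphas."""
    total = sum(alphas)
    return [a / total for a in alphas]


if __name__ == "__main__":
    rng = random.Random(0)
    alphas = [2.0, 1.0, 1.0]  # assumed concentration parameters
    weights = sample_dirichlet(alphas, rng)
    for name, w in zip(EMOTIONS, weights):
        print(f"{name}: {w:.3f}")
    print("expected weights:", dirichlet_mean(alphas))
```

Each sample is a set of non-negative weights that sum to one, so it can be read as "how much of each emotion is active" at a given moment.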
A novel language model for predicting serious adverse event results in clinical trials from their prospective registrations
Hu, Qixuan, Zhang, Xumou, Kim, Jinman, Bourgeois, Florence, Dunn, Adam G.
Objectives: With accurate estimates of expected safety results, clinical trials could be better designed and monitored. We evaluated methods for predicting serious adverse event (SAE) results in clinical trials using information only from their registrations prior to the trial. Material and Methods: We analyzed 22,107 two-arm parallel interventional clinical trials from ClinicalTrials.gov with structured summary results. Two prediction models were developed: a classifier predicting whether a greater proportion of participants in an experimental arm would have SAEs (area under the receiver operating characteristic curve; AUC) compared to the control arm, and a regression model to predict the proportion of participants with SAEs in the control arms (root mean squared error; RMSE). A transfer learning approach using pretrained language models (e.g., ClinicalT5, BioBERT) was used for feature extraction, combined with a downstream model for prediction. To maintain semantic representation in long trial texts exceeding localized language model input limits, a sliding window method was developed for embedding extraction. Results: The best model (ClinicalT5+Transformer+MLP) had 77.6% AUC when predicting which trial arm had a higher proportion of SAEs. When predicting SAE proportion in the control arm, the same model achieved RMSE of 18.6%. The sliding window approach consistently outperformed direct comparisons. Across 12 classifiers, the average absolute AUC increase was 2.00%, and absolute RMSE reduction was 1.58% across 12 regressors. Discussion: Summary results data from ClinicalTrials.gov remain underutilized. Predicted results of publicly reported trials provide an opportunity to identify discrepancies between expected and reported safety results.
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > United States > District of Columbia > Washington (0.04)
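The sliding window embedding extraction mentioned in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the window size, the stride, the mean pooling of window vectors, and the `toy_embed` stand-in for a pretrained encoder are all assumptions.

```python
from typing import Callable, List, Sequence


def sliding_windows(tokens: Sequence[str], size: int, stride: int) -> List[Sequence[str]]:
    """Split tokens into windows of at most `size` tokens, advancing by `stride`."""
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    windows = []
    start = 0
    while True:
        windows.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the current window already reaches the end of the text
        start += stride
    return windows


def embed_long_text(
    tokens: Sequence[str],
    embed: Callable[[Sequence[str]], List[float]],
    size: int,
    stride: int,
) -> List[float]:
    """Embed every window, then average the window vectors element-wise."""
    vectors = [embed(w) for w in sliding_windows(tokens, size, stride)]
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def toy_embed(window: Sequence[str]) -> List[float]:
    """Hypothetical encoder stand-in: [token count, mean token length]."""
    n = len(window)
    return [float(n), sum(len(t) for t in window) / n if n else 0.0]


if __name__ == "__main__":
    tokens = "a long trial registration text that exceeds the input limit".split()
    print(embed_long_text(tokens, toy_embed, size=4, stride=2))
```

Choosing a stride smaller than the window size makes neighbouring windows overlap, so no sentence is cut off without context; a stride larger than the window size would skip tokens.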
Multi-Class Human/Object Detection on Robot Manipulators using Proprioceptive Sensing
Hehli, Justin, Heiniger, Marco, Rezayati, Maryam, van de Venn, Hans Wernher
In physical human-robot collaboration (pHRC) settings, humans and robots collaborate directly in shared environments. Robots must analyze interactions with objects to ensure safety and facilitate meaningful workflows. One critical aspect is human/object detection, where the contacted object is identified. Past research introduced binary machine learning classifiers to distinguish between soft and hard objects. This study improves upon those results by evaluating three-class human/object detection models, offering more detailed contact analysis. A dataset was collected using the Franka Emika Panda robot manipulator, exploring preprocessing strategies for time-series analysis. Models including LSTM, GRU, and Transformers were trained on these datasets. The best-performing model achieved 91.11% accuracy during real-time testing, demonstrating the feasibility of multi-class detection models. Additionally, a comparison of preprocessing strategies suggests a sliding window approach is optimal for this task.
- Europe > Switzerland > Zürich > Zürich (0.04)
- Africa > Central African Republic > Ombella-M'Poko > Bimbo (0.04)
Using Machine Learning and Neural Networks to Analyze and Predict Chaos in Multi-Pendulum and Chaotic Systems
Ramachandruni, Vasista, Nara, Sai Hruday Reddy, Lalu, Geo, Yang, Sabrina, Kumar, Mohit Ramesh, Jain, Aarjav, Mehta, Pratham, Koo, Hankyu, Damonte, Jason, Akl, Marx
A chaotic system is a highly volatile system characterized by its sensitive dependence on initial conditions and outside factors. Chaotic systems are prevalent throughout the world today: in weather patterns, disease outbreaks, and even financial markets. Chaotic systems are seen in every field of science and humanities, so being able to predict these systems is greatly beneficial to society. In this study, we evaluate 10 different machine learning models and neural networks [1] based on Root Mean Squared Error (RMSE) and R^2 values for their ability to predict one of these systems, the multi-pendulum. We begin by generating synthetic data representing the angles of the pendulum over time using the fourth-order Runge-Kutta method for solving ordinary differential equations (ODE-RK4) [2]. At first, we used the single-step sliding window approach, predicting the 50th step after training on steps 0-49, and so forth. However, to more accurately cover chaotic motion and behavior in these systems, we transitioned to a time-step based approach. Here, we trained the model/network on many initial angles and tested it on a completely new set of initial angles, or 'in-between', to capture chaotic motion to its fullest extent. We also evaluated the stability of the system using Lyapunov exponents. We concluded that for a double pendulum, the best model was the Long Short Term Memory Network (LSTM) [3] for the sliding window and time step approaches in both friction and frictionless scenarios. For the triple pendulum, the Vanilla Recurrent Neural Network (VRNN) [4] was the best for the sliding window and the Gated Recurrent Network (GRU) [5] was the best for the time step approach, but for friction, LSTM was the best.
- North America > United States > California > Santa Clara County > Palo Alto (0.05)
- North America > United States > New York > Broome County > Binghamton (0.04)
- North America > United States > Michigan (0.04)
- Banking & Finance (0.54)
- Health & Medicine (0.34)
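The single-step sliding window setup described in the abstract above (use steps 0-49 to predict step 50, then shift by one) can be sketched as a dataset builder. The function name `make_windows` and the toy series are assumptions for illustration; the abstract does not show the authors' data pipeline.

```python
from typing import List, Sequence, Tuple


def make_windows(series: Sequence[float], window: int = 50) -> Tuple[List[List[float]], List[float]]:
    """Build (input window, next value) pairs for single-step prediction.

    The pair at index k uses series[k : k + window] as input and
    series[k + window] as the target, so the window slides by one step.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    inputs, targets = [], []
    for start in range(len(series) - window):
        inputs.append(list(series[start:start + window]))
        targets.append(series[start + window])
    return inputs, targets


if __name__ == "__main__":
    # Toy stand-in for simulated pendulum angles (assumed data, not real output).
    angles = [0.01 * t for t in range(60)]
    X, y = make_windows(angles, window=50)
    print(len(X), "training pairs; first target is step", angles.index(y[0]))
```

A series shorter than or equal to the window length yields no pairs, which is why the simulated trajectory needs to be longer than the window.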
Toward Integrating Semantic-aware Path Planning and Reliable Localization for UAV Operations
Canh, Thanh Nguyen, Ngo, Huy-Hoang, HoangVan, Xiem, Chong, Nak Young
Localization is one of the most crucial tasks for Unmanned Aerial Vehicle (UAV) systems and directly impacts overall performance; it can be achieved with various sensors and supports numerous tasks such as search and rescue operations, object tracking, and construction. However, due to the negative effects of challenging environments, UAVs may lose signals for localization. In this paper, we present an effective path-planning system leveraging semantic segmentation information to navigate around texture-less and problematic areas like lakes, oceans, and high-rise buildings using a monocular camera. We introduce a real-time semantic segmentation architecture and a novel keyframe decision pipeline to optimize image inputs based on pixel distribution, reducing processing time. A hierarchical planner based on the Dynamic Window Approach (DWA) algorithm, integrated with a cost map, is designed to facilitate efficient path planning. The system is implemented in a photo-realistic simulation environment using Unity, aligning with segmentation model parameters. Comprehensive qualitative and quantitative evaluations validate the effectiveness of our approach, showing significant improvements in the reliability and efficiency of UAV localization in challenging environments.
- Information Technology > Robotics & Automation (0.66)
- Aerospace & Defense (0.49)
Unsupervised Fault Detection using SAM with a Moving Window Approach
Automated fault detection and monitoring in engineering are critical but frequently difficult owing to the necessity for collecting and labeling large amounts of defective samples. We present an unsupervised method that uses the high-end Segment Anything Model (SAM) and a moving window approach. SAM has gained recognition in AI image segmentation communities for its accuracy and versatility. However, its performance can be inconsistent when dealing with certain unexpected shapes, such as shadows and subtle surface irregularities. This limitation raises concerns about its applicability for fault detection in real-world scenarios. We aim to overcome these challenges without requiring fine-tuning or labeled data. Our technique divides pictures into smaller windows, which are subsequently processed using SAM. This increases the accuracy of fault identification by focusing on localized details. We compute the sizes of the segmented sections and then use a clustering technique to discover consistent fault areas while filtering out noise. To further improve the method's robustness, we propose adding the Exponentially Weighted Moving Average (EWMA) technique for continuous monitoring in industrial settings, which would improve the method's capacity to trace faults over time. We compare our method to various well-established methods using a real case study where our model achieves 0.96 accuracy compared to 0.85 for the second-best method. We also compare our method using two open-source datasets where our model attains a consistent 0.86 accuracy across the datasets compared to 0.53 and 0.54 for the second-best models.
- North America > United States > Texas > Denton County > Denton (0.14)
- Oceania > Australia > Western Australia > Perth (0.04)
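The EWMA monitoring idea proposed in the abstract above can be illustrated with a standard EWMA control chart. This is a generic textbook sketch, not the authors' method: the monitored quantity (a per-image fault area), the in-control target and sigma, the smoothing factor `lam`, and the limit width are all assumed values.

```python
import math
from typing import List, Sequence, Tuple


def ewma_chart(
    values: Sequence[float],
    target: float,
    sigma: float,
    lam: float = 0.2,
    width: float = 3.0,
) -> List[Tuple[float, bool]]:
    """Return (EWMA statistic, alarm flag) for each observation.

    z_t = lam * x_t + (1 - lam) * z_{t-1}, with z_0 = target.
    An alarm is raised when z_t leaves the time-varying control limits
    target +/- width * sigma * sqrt(lam / (2 - lam) * (1 - (1 - lam)^(2t))).
    """
    z = target
    results = []
    for t, x in enumerate(values, start=1):
        z = lam * x + (1 - lam) * z
        half_width = width * sigma * math.sqrt(
            lam / (2 - lam) * (1 - (1 - lam) ** (2 * t))
        )
        results.append((z, abs(z - target) > half_width))
    return results


if __name__ == "__main__":
    # Assumed fault-area measurements: stable at first, then a shift upward.
    areas = [10.0] * 5 + [20.0] * 5
    for step, (z, alarm) in enumerate(ewma_chart(areas, target=10.0, sigma=1.0), start=1):
        print(step, round(z, 3), "ALARM" if alarm else "ok")
```

A smaller `lam` smooths more heavily and reacts more slowly, which suits slow drifts; a larger `lam` tracks recent measurements more closely.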
Sliding Window 3-Objective Pareto Optimization for Problems with Chance Constraints
Multi-objective formulations have been widely used to solve single-objective optimization problems. The initial study carried out by Knowles et al. [8] for the H-IFF and the traveling salesperson problem shows that such formulations can significantly reduce the number of local optima in the search space and uses the term multi-objectivization for such approaches. Using multi-objective formulations to solve constrained single-objective optimization problems by evolutionary multi-objective optimization using the constraint as an additional objective has been shown to be highly beneficial for a wide range of problems [4,9,12]. Using the constraint as an additional objective for such problems allows simple evolutionary multi-objective algorithms such as GSEMO to mimic a greedy behaviour and, as a consequence, allows us to achieve theoretically best possible performance guarantees for a wide range of constrained submodular optimization problems [17-19]. Such approaches have been widely studied recently under the term Pareto optimization in the artificial intelligence and machine learning literature [22]. In the context of problems with stochastic constraints, it has recently been shown that 3-objective formulations where the given constraint is relaxed into a third objective lead to better performance than 2-objective formulations that optimize the expected value and variance of the given stochastic components under the given constraint [14, 15].
- Oceania > Australia > South Australia > Adelaide (0.04)
- Europe > Denmark > Capital Region > Kongens Lyngby (0.04)
Maintenance Techniques for Anomaly Detection AIOps Solutions
Poenaru-Olaru, Lorena, Karpova, Natalia, Cruz, Luis, Rellermeyer, Jan, van Deursen, Arie
Anomaly detection techniques are essential in automating the monitoring of IT systems and operations. These techniques imply that machine learning algorithms are trained on operational data corresponding to a specific period of time and that they are continuously evaluated on newly emerging data. Operational data is constantly changing over time, which affects the performance of deployed anomaly detection models. Therefore, continuous model maintenance is required to preserve the performance of anomaly detectors over time. In this work, we analyze two different anomaly detection model maintenance techniques in terms of the model update frequency, namely blind model retraining and informed model retraining. We further investigate the effects of updating the model by retraining it on all the available data (full-history approach) and on only the newest data (sliding window approach). Moreover, we investigate whether a data change monitoring tool is capable of determining when the anomaly detection model needs to be updated through retraining.
- North America > United States > California > San Francisco County > San Francisco (0.28)
- North America > United States > District of Columbia > Washington (0.05)
- Europe > Netherlands > South Holland > Delft (0.04)