Is Your Imitation Learning Policy Better than Mine? Policy Comparison with Near-Optimal Stopping
Snyder, David, Hancock, Asher James, Badithela, Apurva, Dixon, Emma, Miller, Patrick, Ambrus, Rares Andrei, Majumdar, Anirudha, Itkina, Masha, Nishimura, Haruki
Imitation learning has enabled robots to perform complex, long-horizon tasks in challenging dexterous manipulation settings. As new methods are developed, they must be rigorously evaluated and compared against corresponding baselines through repeated evaluation trials. However, policy comparison is fundamentally constrained by a small feasible sample size (e.g., 10 or 50) due to significant human effort and limited inference throughput of policies. This paper proposes a novel statistical framework for rigorously comparing two policies in the small sample size regime. Prior work in statistical policy comparison relies on batch testing, which requires a fixed, pre-determined number of trials and lacks flexibility in adapting the sample size to the observed evaluation data. Furthermore, extending the test with additional trials risks inducing inadvertent p-hacking, undermining statistical assurances. In contrast, our proposed statistical test is sequential, allowing researchers to decide whether or not to run more trials based on intermediate results. This adaptively tailors the number of trials to the difficulty of the underlying comparison, saving significant time and effort without sacrificing probabilistic correctness. Extensive numerical simulation and real-world robot manipulation experiments show that our test achieves near-optimal stopping, letting researchers stop evaluation and make a decision in a near-minimal number of trials. Specifically, it reduces the number of evaluation trials by up to 40% as compared to state-of-the-art baselines, while preserving the probabilistic correctness and statistical power of the comparison. Moreover, our method is strongest in the most challenging comparison instances (requiring the most evaluation trials); in a multi-task comparison scenario, we save the evaluator more than 200 simulation rollouts.
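The p-hacking risk mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the paper's sequential test. It assumes two policies with the same true success rate, a naive pooled two-proportion z-test, and a hypothetical schedule of looks after 10, 20, 30, 40, and 50 trials per policy. It compares the false positive rate of testing once at the end with that of stopping at the first look that appears significant.

```python
import math
import random


def z_rejects(succ_a, succ_b, n, z_crit=1.96):
    """Two-sided pooled two-proportion z-test with n trials per policy."""
    pooled = (succ_a + succ_b) / (2 * n)
    se = math.sqrt(2 * pooled * (1 - pooled) / n)
    if se == 0:
        return False  # all successes or all failures: no evidence of a difference
    return abs(succ_a - succ_b) / n / se > z_crit


def false_positive_rates(p=0.5, looks=(10, 20, 30, 40, 50), reps=2000, seed=0):
    """Simulate identical policies; return (fixed-sample rate, peeking rate)."""
    rng = random.Random(seed)
    fixed = 0
    peeking = 0
    for _rep in range(reps):
        succ_a = succ_b = 0
        n_done = 0
        rejected_early = False
        for n in looks:
            # Run the additional trials needed to reach n per policy.
            for _trial in range(n - n_done):
                succ_a += rng.random() < p
                succ_b += rng.random() < p
            n_done = n
            rejected = z_rejects(succ_a, succ_b, n)
            if n == looks[-1]:
                fixed += rejected                      # one test at the end
                peeking += rejected or rejected_early  # any look rejected
            elif rejected:
                rejected_early = True
    return fixed / reps, peeking / reps


fixed_rate, peeking_rate = false_positive_rates()
print(f"fixed-sample false positive rate: {fixed_rate:.3f}")
print(f"peeking false positive rate:      {peeking_rate:.3f}")
```

Because the peeking procedure rejects whenever any look rejects, its false positive rate can never be lower than the fixed-sample rate, and in this setup it is noticeably higher than the nominal 5% level. A properly designed sequential test controls this inflation.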
OLMD: Orientation-aware Long-term Motion Decoupling for Continuous Sign Language Recognition
Yu, Yiheng, Liu, Sheng, Feng, Yuan, Xu, Min, Jin, Zhelun, Yang, Xuhua
The primary challenge in continuous sign language recognition (CSLR) stems from the presence of multi-orientational and long-term motions. However, current research overlooks these crucial aspects, which significantly impacts accuracy. To tackle these issues, we propose a novel CSLR framework: Orientation-aware Long-term Motion Decoupling (OLMD), which efficiently aggregates long-term motions and decouples multi-orientational signals into easily interpretable components. Specifically, our innovative Long-term Motion Aggregation (LMA) module filters out static redundancy while adaptively capturing abundant features of long-term motions. We further enhance orientation awareness by decoupling complex movements into horizontal and vertical components, allowing for motion purification in both orientations. Additionally, two coupling mechanisms are proposed: stage and cross-stage coupling, which together enrich multi-scale features and improve the generalization capabilities of the model. Experimentally, OLMD shows SOTA performance on three large-scale datasets: PHOENIX14, PHOENIX14-T, and CSL-Daily. Notably, we improved the word error rate (WER) on PHOENIX14 by an absolute 1.6% compared to the previous SOTA.
Sample size determination for machine learning in medical research
Arifin, Wan Nor, Yaacob, Najib Majdi
Machine learning (ML) methods are being increasingly used across various domains of medical research. However, despite advancements in the use of ML in medicine, clear and definitive guidelines for determining sample sizes in medical ML research are lacking. This article proposes a method for determining sample sizes for medical research utilizing ML methods, beginning with the determination of the testing set sample size, followed by the determination of the training set and total sample sizes.
Introduction
Machine learning (ML) methods are being increasingly used in medical research, spanning domains of medicine such as oncology, orthopaedics, ophthalmology, and general practice (Sirocchi et al., 2024). However, despite this advancement in medical research, there are currently no clear and definitive guidelines for determining sample sizes when using ML methods in the medical domain.
Chance-Constrained Sampling-Based MPC for Collision Avoidance in Uncertain Dynamic Environments
Mohamed, Ihab S., Ali, Mahmoud, Liu, Lantao
Navigating safely in dynamic and uncertain environments is challenging due to uncertainties in perception and motion. This letter presents C2U-MPPI, a robust sampling-based Model Predictive Control (MPC) framework that addresses these challenges by leveraging the Unscented Model Predictive Path Integral (U-MPPI) control strategy with integrated probabilistic chance constraints, ensuring more reliable and efficient navigation under uncertainty. Unlike gradient-based MPC methods, our approach (i) avoids linearization of system dynamics and directly applies non-convex and nonlinear chance constraints, enabling more accurate and flexible optimization, and (ii) enhances computational efficiency by reformulating probabilistic constraints into a deterministic form and employing a layered dynamic obstacle representation, enabling real-time handling of multiple obstacles. Extensive experiments in simulated and real-world human-shared environments validate the effectiveness of our algorithm against baseline methods, showcasing its capability to generate feasible trajectories and control inputs that adhere to system dynamics and constraints in dynamic settings, enabled by an unscented-based sampling strategy and risk-sensitive trajectory evaluation. A supplementary video is available at: https://youtu.be/FptAhvJlQm8
Enhancing Multivariate Time Series-based Solar Flare Prediction with Multifaceted Preprocessing and Contrastive Learning
EskandariNasab, MohammadReza, Hamdi, Shah Muhammad, Boubrahimi, Soukaina Filali
Accurate solar flare prediction is crucial due to the significant risks that intense solar flares pose to astronauts, space equipment, and satellite communication systems. Our research enhances solar flare prediction by utilizing advanced data preprocessing and classification methods on a multivariate time series-based dataset of photospheric magnetic field parameters. First, our study employs a novel preprocessing pipeline that includes missing value imputation, normalization, balanced sampling, near decision boundary sample removal, and feature selection to significantly boost prediction accuracy. Second, we integrate contrastive learning with a GRU regression model to develop a novel classifier, termed ContReg, which employs dual learning methodologies, thereby further enhancing prediction performance. To validate the effectiveness of our preprocessing pipeline, we compare and demonstrate the performance gain of each step, and to demonstrate the efficacy of the ContReg classifier, we compare its performance to that of sequence-based deep learning architectures, machine learning models, and findings from previous studies. Our results illustrate exceptional True Skill Statistic (TSS) scores, surpassing previous methods and highlighting the critical role of precise data preprocessing and classifier development in time series-based solar flare prediction.
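For readers unfamiliar with the True Skill Statistic, it is commonly defined as the recall minus the false alarm rate, TSS = TP/(TP+FN) - FP/(FP+TN), and ranges from -1 to 1. The sketch below computes it from confusion-matrix counts. The function name and the counts are hypothetical and are not taken from the paper.

```python
def true_skill_statistic(tp, fn, fp, tn):
    """TSS = recall - false alarm rate; ranges from -1 (worst) to 1 (best)."""
    recall = tp / (tp + fn)            # fraction of actual flares that were predicted
    false_alarm_rate = fp / (fp + tn)  # fraction of non-flares flagged as flares
    return recall - false_alarm_rate


# Hypothetical counts for illustration only.
score = true_skill_statistic(tp=40, fn=10, fp=20, tn=180)
print(round(score, 3))
```

Because each term is normalized within its own class, the score is not affected by the ratio of flaring to non-flaring samples, which is why it is popular for imbalanced problems such as flare prediction.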
A Paradigm for Potential Model Performance Improvement in Classification and Regression Problems. A Proof of Concept
Lobo-Cabrera, Francisco Javier
Binary classification, multilabel classification, and regression prediction constitute fundamental paradigms in machine learning, addressing distinct types of predictive modeling tasks. Binary classification involves categorizing instances into one of two classes, typically denoted as positive and negative [1][2][3]. This modeling framework is particularly applicable to scenarios where outcomes are binary in nature, as observed in domains such as spam detection and medical diagnosis. In multilabel classification, the scope extends to situations where instances can be associated with multiple classes simultaneously, a common occurrence in applications like image tagging and document categorization [1][4]. Conversely, regression prediction is concerned with forecasting continuous outcomes, aiming to predict numeric values [3].
Quantitative Analysis of Forecasting Models: In the Aspect of Online Political Bias
Tripuraneni, Srinath Sai, Kamal, Sadia, Bagavathi, Arunkumar
Understanding and mitigating political bias in online social media platforms are crucial tasks to combat misinformation and echo chamber effects. However, characterizing political bias temporally using computational methods presents challenges due to the high frequency of noise in social media datasets. While existing research has explored various approaches to political bias characterization, the ability to forecast political bias and anticipate how political conversations might evolve in the near future has not been extensively studied. In this paper, we propose a heuristic approach to classify social media posts into five distinct political leaning categories. Since there is a lack of prior work on forecasting political bias, we conduct an in-depth analysis of existing baseline models to identify which model is best suited to forecasting political leaning time series. Our approach involves utilizing existing time series forecasting models on two social media datasets with different political ideologies, specifically Twitter and Gab. Through our experiments and analyses, we seek to shed light on the challenges and opportunities in forecasting political bias in social media platforms. Ultimately, our work aims to pave the way for developing more effective strategies to mitigate the negative impact of political bias in the digital realm.
Assumptions of Linear Regression -- What Fellow Data Scientists Should Know
Linear regression is a linear approach to modeling the relationship between a target variable and one or more independent variables. The modeled relationship is then used for predictive analytics. Fitting the model is only half the work: linear regression also assumes that the errors (residuals) follow a normal distribution, although this is not strictly required when the sample size is very large.
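One way to check the normality assumption is to fit the model and test the residuals. The sketch below is a minimal illustration using only the Python standard library. It assumes a single predictor, synthetic data, and the Jarque-Bera statistic as the normality check (a choice made here for illustration, not one named in the article). Under normality the statistic is approximately chi-square distributed with 2 degrees of freedom, so values well above 5.99 suggest non-normal residuals at the 5% level.

```python
import random


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def jarque_bera(residuals):
    """Jarque-Bera statistic built from sample skewness and excess kurtosis."""
    n = len(residuals)
    mean = sum(residuals) / n
    m2 = sum((r - mean) ** 2 for r in residuals) / n
    m3 = sum((r - mean) ** 3 for r in residuals) / n
    m4 = sum((r - mean) ** 4 for r in residuals) / n
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return n / 6.0 * (skew ** 2 + excess_kurtosis ** 2 / 4.0)


def residuals_of(x, y):
    """Fit a line and return the residuals y - (a + b * x)."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Synthetic data (an assumption for illustration): true line y = 1 + 2x.
rng = random.Random(0)
x = [rng.uniform(0, 10) for _ in range(2000)]
y_normal = [1 + 2 * xi + rng.gauss(0, 1) for xi in x]               # normal errors
y_skewed = [1 + 2 * xi + rng.expovariate(1.0) - 1.0 for xi in x]    # skewed errors

jb_normal = jarque_bera(residuals_of(x, y_normal))
jb_skewed = jarque_bera(residuals_of(x, y_skewed))
print(f"Jarque-Bera, normal errors: {jb_normal:.2f}")
print(f"Jarque-Bera, skewed errors: {jb_skewed:.2f}")
```

With the skewed errors the statistic is far above the 5.99 threshold, while the slope estimate stays close to the true value in both cases. This reflects the point in the text: with a large sample, non-normal errors affect the coefficient estimates less than they affect small-sample inference.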