Goto

Collaborating Authors

 flam


Multi-Loss Learning for Speech Emotion Recognition with Energy-Adaptive Mixup and Frame-Level Attention

arXiv.org Artificial Intelligence

Speech emotion recognition (SER) is an important technology in human-computer interaction. However, achieving high performance is challenging due to emotional complexity and scarce annotated data. To tackle these challenges, we propose a multi-loss learning (MLL) framework integrating an energy-adaptive mixup (EAM) method and a frame-level attention module (FLAM). The EAM method leverages SNR-based augmentation to generate diverse speech samples capturing subtle emotional variations. FLAM enhances frame-level feature extraction for multi-frame emotional cues. Our MLL strategy combines Kullback-Leibler divergence, focal, center, and supervised contrastive loss to optimize learning, address class imbalance, and improve feature separability. We evaluate our method on four widely used SER datasets: IEMOCAP, MSP-IMPROV, RAVDESS, and SAVEE. The results demonstrate our method achieves state-of-the-art performance, suggesting its effectiveness and robustness.


FinGEAR: Financial Mapping-Guided Enhanced Answer Retrieval

arXiv.org Artificial Intelligence

Financial disclosures such as 10-K filings present challenging retrieval problems due to their length, regulatory section hierarchy, and domain-specific language, which standard retrieval-augmented generation (RAG) models underuse. We introduce FinGEAR (Financial Mapping-Guided Enhanced Answer Retrieval), a retrieval framework tailored to financial documents. FinGEAR combines a finance lexicon for Item-level guidance (FLAM), dual hierarchical indices for within-Item search (Summary Tree and Question Tree), and a two-stage cross-encoder reranker. This design aligns retrieval with disclosure structure and terminology, enabling fine-grained, query-aware context selection. Evaluated on full 10-Ks with queries aligned to the FinQA dataset, FinGEAR delivers consistent gains in precision, recall, F1, and relevancy, improving F1 by up to 56.7% over flat RAG, 12.5% over graph-based RAGs, and 217.6% over prior tree-based systems, while also increasing downstream answer accuracy with a fixed reader. By jointly modeling section hierarchy and domain lexicon signals, FinGEAR improves retrieval fidelity and provides a practical foundation for high-stakes financial analysis.


FLAM: Foundation Model-Based Body Stabilization for Humanoid Locomotion and Manipulation

arXiv.org Artificial Intelligence

Humanoid robots have attracted significant attention in recent years. Reinforcement Learning (RL) is one of the main ways to control the whole body of humanoid robots. RL enables agents to complete tasks by learning from environment interactions, guided by task rewards. However, existing RL methods rarely explicitly consider the impact of body stability on humanoid locomotion and manipulation. Achieving high performance in whole-body control remains a challenge for RL methods that rely solely on task rewards. In this paper, we propose a Foundation model-based method for humanoid Locomotion And Manipulation (FLAM for short). FLAM integrates a stabilizing reward function with a basic policy. The stabilizing reward function is designed to encourage the robot to learn stable postures, thereby accelerating the learning process and facilitating task completion. Specifically, the robot pose is first mapped to the 3D virtual human model. Then, the human pose is stabilized and reconstructed through a human motion reconstruction model. Finally, the pose before and after reconstruction is used to compute the stabilizing reward. By combining this stabilizing reward with the task reward, FLAM effectively guides policy learning. Experimental results on a humanoid robot benchmark demonstrate that FLAM outperforms state-of-the-art RL methods, highlighting its effectiveness in improving stability and overall performance.


Fused Lasso Additive Model

arXiv.org Machine Learning

Ashley Petersen, Daniela Witten†, and Noah Simon ‡ Department of Biostatistics, University of Washington, Seattle W A 98195 March 29, 2018 We consider the problem of predicting an outcome variable usingp covariates that are measured onn independent observations, in the setting in which flexible and interpretable fits are desirable. We propose the fused lasso additive model (FLAM), in which each additive function is estimated to be piecewise constant with a small number of adaptively-chosen knots. FLAM is the solution to a convex optimization problem, for which a simple algorithm with guaranteed convergence to the global optimum is provided. FLAM is shown to be consistent in high dimensions, and an unbiased estimator of its degrees of freedom is proposed. We evaluate the performance of FLAM in a simulation study and on two data sets. Keywords: additive model, feature selection, high-dimensional, nonparametric regression, piecewise constant, sparsity 1 Introduction In this paper, we consider the task of predicting a response variable usingp features measured on n independent observations. In this paper, we propose a method that balances the tradeoff between inter-pretability and flexibility, while also allowing for sparsity in high dimensions whenp n . It selects a subset of features to include in the model, and for these features it fits piecewise constant functions with knots that are chosen adaptively based on the data. We now introduce some notation. We letX denote ann p matrix, for whichx j is the j th column (feature), and for which thei th element (observation) isx ij .