Medioni, Gerard
Shape of You: Precise 3D shape estimations for diverse body types
Sarkar, Rohan, Dave, Achal, Medioni, Gerard, Biggs, Benjamin
This paper presents Shape of You (SoY), an approach to improve the accuracy of 3D body shape estimation for vision-based clothing recommendation systems. While existing methods have successfully estimated 3D poses, there remains a lack of work in precise shape estimation, particularly for diverse human bodies. To address this gap, we propose two loss functions that can be readily integrated into parametric 3D human reconstruction pipelines. Additionally, we propose a test-time optimization routine that further improves quality. Our method improves over the recent SHAPY method by 17.7% on the challenging SSP-3D dataset. We consider our work to be a step towards a more accurate 3D shape estimation system that works reliably on diverse body types and holds promise for practical applications in the fashion industry.
GPT4Rec: A Generative Framework for Personalized Recommendation and User Interests Interpretation
Li, Jinming, Zhang, Wentao, Wang, Tian, Xiong, Guanglei, Lu, Alan, Medioni, Gerard
Recent advancements in Natural Language Processing (NLP) have led to the development of NLP-based recommender systems that have shown superior performance. However, current models commonly treat items as mere IDs and adopt discriminative modeling, resulting in limitations of (1) fully leveraging the content information of items and the language modeling capabilities of NLP models; (2) interpreting user interests to improve relevance and diversity; and (3) adapting practical circumstances such as growing item inventories. To address these limitations, we present GPT4Rec, a novel and flexible generative framework inspired by search engines. It first generates hypothetical "search queries" given item titles in a user's history, and then retrieves items for recommendation by searching these queries. The framework overcomes previous limitations by learning both user and item embeddings in the language space. To well-capture user interests with different aspects and granularity for improving relevance and diversity, we propose a multi-query generation technique with beam search. The generated queries naturally serve as interpretable representations of user interests and can be searched to recommend cold-start items. With GPT-2 language model and BM25 search engine, our framework outperforms state-of-the-art methods by $75.7\%$ and $22.2\%$ in Recall@K on two public datasets. Experiments further revealed that multi-query generation with beam search improves both the diversity of retrieved items and the coverage of a user's multi-interests. The adaptiveness and interpretability of generated queries are discussed with qualitative case studies.
Graph-Based Manifold Frequency Analysis for Denoising
Deutsch, Shay, Ortega, Antonio, Medioni, Gerard
We propose a new framework for manifold denoising based on processing in the graph Fourier frequency domain, derived from the spectral decomposition of the discrete graph Laplacian. Our approach uses the Spectral Graph Wavelet transform in order to per- form non-iterative denoising directly in the graph frequency domain, an approach inspired by conventional wavelet-based signal denoising methods. We theoretically justify our approach, based on the fact that for smooth manifolds the coordinate information energy is localized in the low spectral graph wavelet sub-bands, while the noise affects all frequency bands in a similar way. Experimental results show that our proposed manifold frequency denoising (MFD) approach significantly outperforms the state of the art denoising meth- ods, and is robust to a wide range of parameter selections, e.g., the choice of k nearest neighbor connectivity of the graph.
Robust Multiple Manifolds Structure Learning
Gong, Dian, Zhao, Xuemei, Medioni, Gerard
We present a robust multiple manifolds structure learning (RMMSL) scheme to robustly estimate data structures under the multiple low intrinsic dimensional manifolds assumption. In the local learning stage, RMMSL efficiently estimates local tangent space by weighted low-rank matrix factorization. In the global learning stage, we propose a robust manifold clustering method based on local structure learning results. The proposed clustering method is designed to get the flattest manifolds clusters by introducing a novel curved-level similarity function. Our approach is evaluated and compared to state-of-the-art methods on synthetic data, handwritten digit images, human motion capture data and motorbike videos. We demonstrate the effectiveness of the proposed approach, which yields higher clustering accuracy, and produces promising results for challenging tasks of human motion segmentation and motion flow learning from videos.