Apulia
- Europe > Middle East > Malta > Port Region > Southern Harbour District > Floriana (0.04)
- Europe > Austria > Styria > Graz (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Instructional Material (0.46)
- Research Report > New Finding (0.45)
- North America > United States > Pennsylvania (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- Europe > Italy > Apulia > Bari (0.04)
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Asia > Middle East > Jordan (0.04)
- Overview (0.67)
- Research Report > New Finding (0.46)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- North America > United States > California > Los Angeles County > Long Beach (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
Hypervolume Maximization: A Geometric View of Pareto Set Learning
This paper presents a novel approach to multiobjective algorithms aimed at modeling the Pareto set using neural networks. Whereas previous methods mainly focused on identifying a finite number of solutions, our approach allows for the direct modeling of the entire Pareto set. Furthermore, we establish an equivalence between learning the complete Pareto set and maximizing the associated hypervolume, which enables the convergence analysis of hypervolume (as a new metric) for Pareto set learning. Specifically, our new analysis framework reveals the connection between the learned Pareto solution and its representation in a polar coordinate system. We evaluate our proposed approach on various benchmark problems and real-world problems, and the encouraging results make it a potentially viable alternative to existing multiobjective algorithms.
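The hypervolume metric mentioned in the abstract can be illustrated with a minimal sketch for the two-objective minimization case. This is not the paper's learning method; it only computes the area dominated by a finite set of solutions relative to a reference point. The function name `hypervolume_2d` and the example points are assumptions made for illustration.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]


def hypervolume_2d(points: Iterable[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded by `ref` (both objectives minimized)."""
    # Keep only points that are strictly better than the reference point
    # in both objectives; the others contribute no area.
    inside = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])

    area = 0.0
    best_f2 = ref[1]  # lowest second-objective value seen so far in the sweep
    for f1, f2 in inside:
        # Sweeping in increasing f1, a point adds area only if it improves f2;
        # otherwise it is dominated by a point already processed.
        if f2 < best_f2:
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area


if __name__ == "__main__":
    front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
    print(hypervolume_2d(front, ref=(4.0, 4.0)))
```

A larger hypervolume means the solution set covers more of the objective space below the reference point, which is why maximizing it pushes solutions toward the Pareto front.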
- North America > United States > California > Los Angeles County > Long Beach (0.14)
- Asia > China > Hong Kong (0.04)
- North America > Canada > British Columbia > Vancouver (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Italy > Apulia > Bari (0.04)
Statsformer: Validated Ensemble Learning with LLM-Derived Semantic Priors
Zhang, Erica; Sagan, Naomi; Tse, Danny; Zhang, Fangzhao; Pilanci, Mert; Blanchet, Jose
We introduce Statsformer, a principled framework for integrating large language model (LLM)-derived knowledge into supervised statistical learning. Existing approaches are limited in adaptability and scope: they either inject LLM guidance as an unvalidated heuristic, which is sensitive to LLM hallucination, or embed semantic information within a single fixed learner. Statsformer overcomes both limitations through a guardrailed ensemble architecture. We embed LLM-derived feature priors within an ensemble of linear and nonlinear learners, adaptively calibrating their influence via cross-validation. This design yields a flexible system with an oracle-style guarantee that it performs no worse than any convex combination of its in-library base learners, up to statistical error. Empirically, informative priors yield consistent performance improvements, while uninformative or misspecified LLM guidance is automatically downweighted, mitigating the impact of hallucinations across a diverse range of prediction tasks.
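The idea of calibrating a learner's influence on held-out data can be sketched in a much simpler setting than the one the abstract describes. The example below is not Statsformer; it assumes only two base learners whose out-of-fold predictions are already available, and it picks the convex weight that minimizes squared error on those predictions. The function name `best_convex_weight` and the inputs are hypothetical.

```python
from typing import Sequence


def best_convex_weight(
    y: Sequence[float], pred_a: Sequence[float], pred_b: Sequence[float]
) -> float:
    """Weight w in [0, 1] minimizing sum((y - (w*pred_a + (1-w)*pred_b))**2).

    All inputs are assumed to be held-out (out-of-fold) values of equal length.
    """
    diff = [a - b for a, b in zip(pred_a, pred_b)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        # The two learners agree everywhere, so any weight gives the same error.
        return 1.0
    # Unconstrained least-squares solution for a single weight.
    numer = sum((t - b) * d for t, b, d in zip(y, pred_b, diff))
    # Clip to [0, 1] so the result stays a convex combination.
    return min(1.0, max(0.0, numer / denom))


if __name__ == "__main__":
    y = [2.0, 2.0]
    informative = [4.0, 4.0]
    baseline = [0.0, 0.0]
    print(best_convex_weight(y, informative, baseline))
```

With this kind of rule, a learner whose held-out predictions do not help receives a weight near zero, which mirrors the downweighting of uninformative guidance that the abstract reports.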
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Consensus dimension reduction via multi-view learning
Dimension reduction methods are a fundamental class of techniques in data analysis, which aim to find a lower-dimensional representation of higher-dimensional data while preserving as much of the original information as possible. These methods are extensively used in practice, including in exploratory data analyses to visualize data, arguably one of the first and most vital steps in any data analysis (Ray et al., 2021). Notably, in genomics, dimension reduction methods are ubiquitously applied to visualize high-dimensional single-cell RNA sequencing data in two dimensions (Becht et al., 2019). Beyond visualization, dimension reduction methods are also frequently employed to mitigate the curse of dimensionality (Bellman, 1957), engineer new features to improve downstream tasks like prediction (e.g., Massy, 1965), and enable scientific discovery in unsupervised learning settings (Chang et al., 2025). For example, many researchers have used dimension reduction in conjunction with clustering to discover new cell types and cell states (Wu et al., 2021), new cancer subtypes (Northcott et al., 2017), and other substantively meaningful structure in a variety of domains (Bergen et al., 2019; Traven et al., 2017). Given the widespread use and need for dimension reduction methods, numerous dimension reduction techniques have been developed. Popular techniques include but are not limited to principal component analysis (PCA) (Pearson, 1901; Hotelling, 1933), multidimensional scaling (MDS) (Torgerson, 1952; Kruskal, 1964a), Isomap (Tenenbaum et al., 2000), locally linear embedding (LLE) (Roweis and Saul, 2000), t-distributed stochastic neighbor embedding (t-SNE) (van der ...
PinRec: Outcome-Conditioned, Multi-Token Generative Retrieval for Industry-Scale Recommendation Systems
Agarwal, Prabhat; Badrinath, Anirudhan; Bhasin, Laksh; Yang, Jaewon; Botta, Edoardo; Xu, Jiajing; Rosenberg, Charles
Generative retrieval methods utilize generative sequential modeling techniques, such as transformers, to generate candidate items for recommender systems. These methods have demonstrated promising results in academic benchmarks, surpassing traditional retrieval models like two-tower architectures. However, current generative retrieval methods lack the scalability required for industrial recommender systems, and they are insufficiently flexible to satisfy the multiple metric requirements of modern systems. This paper introduces PinRec, a novel generative retrieval model developed for applications at Pinterest. PinRec utilizes outcome-conditioned generation, enabling modelers to specify how to balance various outcome metrics, such as the number of saves and clicks, to effectively align with business goals and user exploration. Additionally, PinRec incorporates multi-token generation to enhance output diversity while optimizing generation. Our experiments demonstrate that PinRec can successfully balance performance, diversity, and efficiency, delivering a significant positive impact to users using generative models. This paper marks a significant milestone in generative retrieval, as it presents, to our knowledge, the first rigorous study on implementing generative retrieval at the scale of Pinterest.
- North America > United States > New York > New York County > New York City (0.05)
- North America > United States > California > Santa Clara County > Palo Alto (0.05)
- North America > Puerto Rico > San Juan > San Juan (0.04)
- Information Technology > Communications > Social Media (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)