Gomes, Ryan G.
Towards a Personal Health Large Language Model
Cosentino, Justin, Belyaeva, Anastasiya, Liu, Xin, Furlotte, Nicholas A., Yang, Zhun, Lee, Chace, Schenck, Erik, Patel, Yojan, Cui, Jian, Schneider, Logan Douglas, Bryant, Robby, Gomes, Ryan G., Jiang, Allen, Lee, Roy, Liu, Yun, Perez, Javier, Rogers, Jameson K., Speed, Cathy, Tailor, Shyam, Walker, Megan, Yu, Jeffrey, Althoff, Tim, Heneghan, Conor, Hernandez, John, Malhotra, Mark, Stern, Leor, Matias, Yossi, Corrado, Greg S., Patel, Shwetak, Shetty, Shravya, Zhan, Jiening, Prabhakara, Shruthi, McDuff, Daniel, McLean, Cory Y.
In health, most large language model (LLM) research has focused on clinical tasks. However, mobile and wearable devices, which are rarely integrated into such tasks, provide rich, longitudinal data for personal health monitoring. Here we present the Personal Health Large Language Model (PH-LLM), fine-tuned from Gemini for understanding and reasoning over numerical time-series personal health data. We created and curated three datasets that test 1) production of personalized insights and recommendations from sleep patterns, physical activity, and physiological responses, 2) expert domain knowledge, and 3) prediction of self-reported sleep outcomes. For the first task, we designed 857 case studies in collaboration with domain experts to assess real-world scenarios in sleep and fitness. Through comprehensive evaluation using domain-specific rubrics, we observed that Gemini Ultra 1.0 and PH-LLM are not statistically different from expert performance in fitness and that, while experts remain superior for sleep, fine-tuning PH-LLM provided significant improvements in using relevant domain knowledge and personalizing information for sleep insights. We evaluated PH-LLM's domain knowledge using multiple-choice sleep medicine and fitness examinations. PH-LLM achieved 79% on sleep and 88% on fitness, exceeding the average scores from a sample of human experts. Finally, we trained PH-LLM to predict self-reported sleep quality outcomes from textual and multimodal encoding representations of wearable data, and demonstrated that multimodal encoding is required to match the performance of specialized discriminative models. Although further development and evaluation are necessary in the safety-critical personal health domain, these results demonstrate both the broad knowledge and capabilities of Gemini models and the benefit of contextualizing physiological data for personal health applications, as done with PH-LLM.
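For illustration only, here is a minimal sketch of the kind of prompt construction the abstract implies: serializing numerical wearable time series into text that an instruction-tuned LLM can reason over. The function, field names, and prompt wording are assumptions for this sketch, not the paper's actual pipeline.

```python
from statistics import mean

def serialize_sleep_log(nights):
    """Render nightly wearable metrics as a text prompt for an LLM.

    Each entry is assumed to be a dict with 'date', 'sleep_hours',
    'resting_hr', and 'hrv_ms' keys; these field names are illustrative.
    """
    lines = [
        f"{n['date']}: slept {n['sleep_hours']:.1f} h, "
        f"resting HR {n['resting_hr']} bpm, HRV {n['hrv_ms']} ms"
        for n in nights
    ]
    lines.append(
        f"Mean sleep over {len(nights)} nights: "
        f"{mean(n['sleep_hours'] for n in nights):.1f} h."
    )
    lines.append("Question: what stands out in this log, and what should I change?")
    return "\n".join(lines)

log = [
    {"date": "2024-05-01", "sleep_hours": 6.2, "resting_hr": 58, "hrv_ms": 42},
    {"date": "2024-05-02", "sleep_hours": 7.8, "resting_hr": 55, "hrv_ms": 51},
]
print(serialize_sleep_log(log))  # prompt text to send to a fine-tuned model
```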
Crowdclustering
Gomes, Ryan G., Welinder, Peter, Krause, Andreas, Perona, Pietro
Is it possible to crowdsource categorization? Amongst the challenges: (a) each annotator has only a partial view of the data, (b) different annotators may have different clustering criteria and may produce different numbers of categories, (c) the underlying category structure may be hierarchical. We propose a Bayesian model of how annotators may approach clustering and show how one may infer clusters/categories, as well as annotator parameters, using this model. Our experiments, carried out on large collections of images, suggest that Bayesian crowdclustering works well and may be superior to single-expert annotations.
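The paper's Bayesian model jointly infers item clusters and per-annotator parameters; reproducing it faithfully is beyond a short sketch. As a deliberately simplified stand-in, the snippet below aggregates annotators' pairwise "same category" votes into a consensus affinity matrix and applies off-the-shelf spectral clustering. The data layout and function names are assumptions for this sketch.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def consensus_clusters(votes, n_items, n_clusters):
    """votes: one list per annotator of (i, j, same) triples, where
    same is 1 if that annotator put items i and j in the same category."""
    agree = np.zeros((n_items, n_items))
    seen = np.zeros((n_items, n_items))
    for triples in votes:
        for i, j, same in triples:
            agree[i, j] += same
            agree[j, i] += same
            seen[i, j] += 1
            seen[j, i] += 1
    # Fraction of annotators who co-clustered each pair; 0.5 where no
    # annotator saw the pair (each annotator has only a partial view).
    affinity = np.where(seen > 0, agree / np.maximum(seen, 1), 0.5)
    np.fill_diagonal(affinity, 1.0)
    model = SpectralClustering(n_clusters=n_clusters,
                               affinity="precomputed", random_state=0)
    return model.fit_predict(affinity)
```

Unlike the paper's generative model, this stand-in fixes the number of clusters in advance and learns nothing about individual annotators' criteria or reliability.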
Discriminative Clustering by Regularized Information Maximization
Krause, Andreas, Perona, Pietro, Gomes, Ryan G.
Is there a principled way to learn a probabilistic discriminative classifier from an unlabeled data set? We present a framework that simultaneously clusters the data and trains a discriminative classifier. We call it Regularized Information Maximization (RIM). RIM optimizes an intuitive information-theoretic objective function that balances class separation, class balance, and classifier complexity. The approach can flexibly incorporate different likelihood functions, express prior assumptions about the relative sizes of different classes, and accommodate partial labels for semi-supervised learning. In particular, we instantiate the framework as unsupervised, multi-class kernelized logistic regression. Our empirical evaluation indicates that RIM outperforms existing methods on several real data sets and demonstrates that RIM is an effective model selection method.
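For reference, the objective the abstract describes can be written in its standard form: an empirical estimate of the mutual information between inputs and predicted labels, penalized by classifier complexity. Here p(y | x; W) is the conditional model, H is Shannon entropy, R(W) is a regularizer (the squared weight norm in the kernelized logistic regression instantiation), and lambda trades the terms off.

```latex
% RIM objective on unlabeled points x_1, ..., x_N:
\max_{W}\;
  \underbrace{H\big(\hat{p}(y)\big)}_{\text{class balance}}
  \;-\; \underbrace{\frac{1}{N}\sum_{i=1}^{N} H\big(p(y \mid x_i; W)\big)}_{\text{class separation}}
  \;-\; \lambda\, R(W),
\qquad
\hat{p}(y) \;=\; \frac{1}{N}\sum_{i=1}^{N} p(y \mid x_i; W)
```

The first two terms together estimate the mutual information I(x; y): high entropy of the average prediction keeps clusters balanced, while low average conditional entropy pushes each point toward a confident class assignment.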