
9bcd0bdb2777fe8c729b682f07e993f1-Supplemental-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems

MIR contains 25 unique labels; we removed the label "night" because it is not in the label set of any ML APIs. For each instance in those datasets, we evaluated the predictions from the mainstream ML APIs. HAPI was collected from 2020 to 2022: for classification tasks, the predictions/annotations of each API were collected in the spring of 2020, 2021, and 2022. The original IMDB dataset is already partitioned into training and testing splits, so we used its testing split of 25,000 text paragraphs.



HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions

Neural Information Processing Systems

Commercial ML APIs offered by providers such as Google, Amazon, and Microsoft have dramatically simplified ML adoption in many applications. Numerous companies and academics pay to use ML APIs for tasks such as object detection, OCR, and sentiment analysis. Different ML APIs tackling the same task can have very heterogeneous performance. Moreover, the ML models underlying the APIs also evolve over time. As ML APIs rapidly become a valuable marketplace and an integral part of analytics, it is critical to systematically study and compare different APIs with each other and to characterize how individual APIs change over time. However, this practically important topic is currently underexplored due to the lack of data.





HAPI: A Model for Learning Robot Facial Expressions from Human Preferences

Yang, Dongsheng, Liu, Qianying, Sato, Wataru, Minato, Takashi, Liu, Chaoran, Nishida, Shin'ya

arXiv.org Artificial Intelligence

Automatic robotic facial expression generation is crucial for human-robot interaction, as handcrafted methods based on fixed joint configurations often yield rigid and unnatural behaviors. Although recent automated techniques reduce the need for manual tuning, they tend to fall short by not adequately bridging the gap between human preferences and model predictions, resulting in a deficiency of nuanced and realistic expressions due to limited degrees of freedom and insufficient perceptual integration. In this work, we propose a novel learning-to-rank framework that leverages human feedback to address this discrepancy and enhance the expressiveness of robotic faces. Specifically, we conduct pairwise comparison annotations to collect human preference data and develop the Human Affective Pairwise Impressions (HAPI) model, a Siamese RankNet-based approach that refines expression evaluation. Results obtained via Bayesian Optimization and an online expression survey on a 35-DOF android platform demonstrate that our approach produces significantly more realistic and socially resonant expressions of Anger, Happiness, and Surprise than those generated by baseline and expert-designed methods. This confirms that our framework effectively bridges the gap between human preferences and model predictions while robustly aligning robotic expression generation with human affective responses.
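The pairwise-preference idea behind a Siamese RankNet can be sketched in a few lines: one shared scoring function evaluates both expressions in a comparison, and the loss pushes the human-preferred expression to score higher. The linear scorer below is a hypothetical stand-in for the paper's shared network; only the standard RankNet cross-entropy loss on the score difference is assumed.

```python
import math

def score(features, weights):
    # Shared scoring function: in a Siamese setup the SAME weights
    # score both items of a pairwise comparison (linear head here
    # as a toy stand-in for the actual network)
    return sum(f * w for f, w in zip(features, weights))

def ranknet_loss(s_preferred, s_other):
    # RankNet models P(preferred beats other) as a sigmoid of the
    # score difference, and minimizes the negative log-likelihood
    p = 1.0 / (1.0 + math.exp(-(s_preferred - s_other)))
    return -math.log(p)
```

Ranking the preferred item higher lowers the loss, so gradient descent on annotated pairs gradually aligns the scorer with human preferences.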



Closing the Gap in High-Risk Pregnancy Care Using Machine Learning and Human-AI Collaboration

Mozannar, Hussein, Utsumi, Yuria, Chen, Irene Y., Gervasi, Stephanie S., Ewing, Michele, Smith-McLallen, Aaron, Sontag, David

arXiv.org Artificial Intelligence

High-risk pregnancy (HRP) is a pregnancy complicated by factors that can adversely affect outcomes of the mother or the infant. Health insurers use algorithms to identify members who would benefit from additional clinical support. We aimed to build machine learning algorithms to identify pregnant patients and triage them by risk of complication to assist care management. In this retrospective study, we trained a hybrid Lasso-regularized classifier to predict whether a patient is currently pregnant using claims data from 36,735 insured members of Independence Blue Cross (IBC), a health insurer in Philadelphia. We then trained a linear classifier on a subset of 12,243 members to predict whether a patient will develop gestational diabetes or gestational hypertension. These algorithms were developed in cooperation with the care management team at IBC and integrated into their dashboard. In small user studies with the nurses, we evaluated the impact of integrating our algorithms into their workflow. We find that the proposed model predicts an earlier pregnancy start date for 3.54% (95% CI 3.05-4.00) of patients with complications compared to only using a set of pre-defined codes that indicate the start of pregnancy, and never a later one, at the expense of a 5.58% (95% CI 4.05-6.40) false positive rate. The classifier for predicting complications has an AUC of 0.754 (95% CI 0.764-0.788) using data up to the patient's first trimester. Nurses from the care management program expressed a preference for the proposed models over existing approaches. The proposed model outperformed commonly used claim codes for the identification of pregnant patients at the expense of a manageable false positive rate. Our risk complication classifier shows that we can accurately triage patients by risk of complication.
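The Lasso (L1) penalty used for the pregnancy-identification classifier induces sparsity by soft-thresholding coefficients toward zero, so weakly predictive claims-code features drop out of the model entirely. A minimal sketch of that mechanism as one proximal-gradient update (function names, learning rate, and penalty strength are illustrative assumptions, not the paper's implementation):

```python
def soft_threshold(x, lam):
    # Proximal operator of the L1 penalty: shrinks x toward zero and
    # sets it exactly to zero when |x| <= lam (the source of sparsity)
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def lasso_prox_step(weights, gradients, lr, lam):
    # One proximal-gradient update: gradient step on the smooth loss,
    # then soft-threshold each coordinate to apply the L1 term
    return [soft_threshold(w - lr * g, lr * lam)
            for w, g in zip(weights, gradients)]
```

In practice a library solver (e.g. scikit-learn's L1-penalized logistic regression) would be used; the sketch only shows why small coefficients are driven exactly to zero.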


Accelerating Transfer Learning with Near-Data Computation on Cloud Object Stores

Guirguis, Arsany, Petrescu, Diana, Dinu, Florin, Quoc, Do Le, Picorel, Javier, Guerraoui, Rachid

arXiv.org Artificial Intelligence

Near-data computation techniques have been successfully deployed to mitigate the cloud network bottleneck between the storage and compute tiers. At Huawei, we are currently looking to get more value from these techniques by broadening their applicability. Machine learning (ML) applications are an appealing and timely target. This paper describes our experience applying near-data computation techniques to transfer learning (TL), a widely popular ML technique, in the context of disaggregated cloud object stores. Our techniques benefit both cloud providers and users. They improve our operational efficiency while providing users the performance improvements they demand from us. The main practical challenge to consider is that the storage-side computational resources are limited. Our approach is to split the TL deep neural network (DNN) during the feature extraction phase, before the training phase. This reduces the network transfers to the compute tier and further decouples the batch size of feature extraction from the training batch size. This facilitates our second technique, storage-side batch adaptation, which enables increased concurrency in the storage tier while avoiding out-of-memory errors. Guided by these insights, we present HAPI, our processing system for TL that spans the compute and storage tiers while remaining transparent to the user. Our evaluation with several state-of-the-art DNNs, such as ResNet, VGG, and Transformer, shows up to 11x improvement in application runtime and up to 8.3x reduction in the data transferred from the storage to the compute tier compared to running the computation entirely in the compute tier.
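The split described above has two stages: the frozen early layers of the DNN run storage-side over extraction batches sized for the storage tier's limited memory, so only compact features cross the network, and the compute tier then re-batches those features at its own training batch size. A toy sketch of the data flow (function names and the list-based "layers" are illustrative assumptions, not HAPI's actual interface):

```python
def storage_side_extract(batch, early_layers):
    # Run the frozen early layers near the data in the object store;
    # only the resulting features are shipped to the compute tier
    feats = batch
    for layer in early_layers:
        feats = [layer(x) for x in feats]
    return feats

def compute_side_batches(features, train_batch_size):
    # Re-group extracted features for training: the extraction batch
    # size is decoupled from the training batch size
    for i in range(0, len(features), train_batch_size):
        yield features[i:i + train_batch_size]
```

Decoupling the two batch sizes is what enables storage-side batch adaptation: the storage tier can shrink extraction batches to avoid out-of-memory errors without constraining training.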