Exploring the Carbon Footprint of Hugging Face's ML Models: A Repository Mining Study

Castaño, Joel, Martínez-Fernández, Silverio, Franch, Xavier, Bogner, Justus

arXiv.org Machine Learning

The rise of machine learning (ML) systems has exacerbated their carbon footprint due to increased capabilities and model sizes. However, there is scarce knowledge on how the carbon footprint of ML models is actually measured, reported, and evaluated. In light of this, the paper aims to analyze the measurement of the carbon footprint of 1,417 ML models and associated datasets on Hugging Face, which is the most popular repository for pretrained ML models. The goal is to provide insights and recommendations on how to report and optimize the carbon efficiency of ML models. The study includes the first repository mining study on the Hugging Face Hub API on carbon emissions. This study seeks to answer two research questions: (1) how do ML model creators measure and report carbon emissions on Hugging Face Hub?, and (2) what aspects impact the carbon emissions of training ML models? The study yielded several key findings. These include a stalled proportion of carbon emissions-reporting models, a slight decrease in reported carbon footprint on Hugging Face over the past 2 years, and a continued dominance of NLP as the main application domain. Furthermore, the study uncovers correlations between carbon emissions and various attributes such as model size, dataset size, and ML application domains. These results highlight the need for software measurements to improve energy reporting practices and promote carbon-efficient model development within the Hugging Face community. In response to this issue, two classifications are proposed: one for categorizing models based on their carbon emission reporting practices and another for their carbon efficiency. The aim of these classification proposals is to foster transparency and sustainable model development within the ML community.
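
The study mines carbon data from model cards exposed through the Hugging Face Hub API. As a minimal sketch of that kind of check, the helper below decides whether a model's card metadata reports emissions via the `co2_eq_emissions` model-card field; the function name, the assumption that the field is either a bare number or a dict with an `emissions` entry, and the sample cards are all illustrative.

```python
# Sketch: given a model's card metadata (a dict, as the Hub API returns it),
# decide whether the model reports carbon emissions and extract the value.
# The co2_eq_emissions key is the model-card field for this; the helper and
# the sample data are illustrative, not the study's actual mining code.

def reported_emissions(card_data: dict):
    """Return the reported CO2-equivalent emissions, or None if unreported."""
    value = card_data.get("co2_eq_emissions")
    if value is None:
        return None
    # The field may be a bare number or a dict with an "emissions" entry.
    if isinstance(value, dict):
        return value.get("emissions")
    return value

# Illustrative card metadata for two hypothetical models.
cards = {
    "model-a": {"co2_eq_emissions": {"emissions": 1250.0, "source": "codecarbon"}},
    "model-b": {"language": "en"},  # no emissions reported
}

reporting = {name: reported_emissions(card) for name, card in cards.items()}
print(reporting)
```

A mining study would apply a check like this across every model returned by the Hub API and aggregate the results by year, domain, and model size.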


Importing Hugging Face models into Spark NLP

#artificialintelligence

Let's suppose I have browsed the Hugging Face Models Hub (https://huggingface.co/models) and identified 7 models I want to import into Spark NLP for BertForSequenceClassification. Since the steps are more or less the same as in the first example, I'm going to automate them all in a loop, from downloading to importing into Spark NLP and running inference, to illustrate an end-to-end import. There is one extra step we need to carry out when importing classifiers from Hugging Face: we need a labels.txt file. That file can be created from the labels in config.json. However, some models lack that field, which leaves only numeric values for the labels and is not very user friendly. To support both importing labels from config.json and supplying our own, let's declare an array: if its value is None, we will import the labels from the model.
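
The labels.txt step described above can be sketched as follows; this is a hedged illustration, not the post's actual script. It assumes config.json uses the standard `id2label` mapping (string ids to label names), and the function name and fallback behavior are my own.

```python
import json
from pathlib import Path

# Sketch of the labels.txt step: prefer labels we supply ourselves; otherwise
# read id2label from the model's config.json; otherwise fall back to bare
# numeric labels. Paths and helper name are illustrative.

def write_labels(model_dir: str, own_labels=None) -> list:
    config = json.loads(Path(model_dir, "config.json").read_text())
    id2label = config.get("id2label")
    if own_labels is not None:
        labels = own_labels                       # caller-provided labels win
    elif id2label:
        # Sort by numeric id so line i of labels.txt names class i.
        labels = [id2label[k] for k in sorted(id2label, key=int)]
    else:
        # No id2label field: only numeric labels are available.
        labels = [str(i) for i in range(config.get("num_labels", 2))]
    Path(model_dir, "labels.txt").write_text("\n".join(labels))
    return labels
```

In the loop over the 7 models, each iteration would call something like `write_labels(model_dir)` before the Spark NLP import step.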


Identify paraphrased text with Hugging Face on Amazon SageMaker

#artificialintelligence

Identifying paraphrased text has business value in many use cases. For example, by identifying sentence paraphrases, a text summarization system could remove redundant information. Another application is to identify plagiarized documents. In this post, we fine-tune a Hugging Face transformer on Amazon SageMaker to identify paraphrased sentence pairs in a few steps. A truly robust model can identify paraphrases even when the wording is completely different, and can also detect meaning differences when the sentences have high lexical overlap.


Announcing managed inference for Hugging Face models in Amazon SageMaker

#artificialintelligence

Hugging Face is a technology startup with an active open-source community that drove the worldwide adoption of transformer-based models through its eponymous Transformers library. Earlier this year, Hugging Face and AWS collaborated to enable you to train and deploy over 10,000 pre-trained models on Amazon SageMaker. For more information on training Hugging Face models at scale on SageMaker, refer to AWS and Hugging Face collaborate to simplify and accelerate adoption of Natural Language Processing models and the sample notebooks. In this post, we discuss different methods to create a SageMaker endpoint for a Hugging Face model. If you're unfamiliar with transformer-based models and their place in the natural language processing (NLP) landscape, here is an overview.
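
One method for creating such an endpoint is to deploy a Hub model directly, passing its ID to the Hugging Face inference container via environment variables. A minimal sketch, assuming the container's documented `HF_MODEL_ID` and `HF_TASK` variables; the helper function and example model are illustrative, and the actual SageMaker SDK deploy call is shown only as a comment because it requires an AWS role.

```python
# Sketch of the container environment used when deploying a Hub model to a
# SageMaker endpoint without packaging model artifacts yourself. HF_MODEL_ID
# and HF_TASK are the variables the inference container reads; the helper
# function is illustrative.

def hub_deploy_env(model_id: str, task: str) -> dict:
    return {
        "HF_MODEL_ID": model_id,  # Hub repository loaded at endpoint start
        "HF_TASK": task,          # pipeline task, e.g. "text-classification"
    }

env = hub_deploy_env("distilbert-base-uncased-finetuned-sst-2-english",
                     "text-classification")
# With the SageMaker Python SDK (not executed here, needs AWS credentials):
# from sagemaker.huggingface import HuggingFaceModel
# model = HuggingFaceModel(env=env, role=role, transformers_version="4.26",
#                          pytorch_version="1.13", py_version="py39")
# predictor = model.deploy(initial_instance_count=1,
#                          instance_type="ml.m5.xlarge")
print(env)
```

The other methods in the post differ mainly in where the model weights come from (a Hub ID as above, versus a `model.tar.gz` in Amazon S3).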


The Second Conversational Intelligence Challenge (ConvAI2)

Dinan, Emily, Logacheva, Varvara, Malykh, Valentin, Miller, Alexander, Shuster, Kurt, Urbanek, Jack, Kiela, Douwe, Szlam, Arthur, Serban, Iulian, Lowe, Ryan, Prabhumoye, Shrimai, Black, Alan W, Rudnicky, Alexander, Williams, Jason, Pineau, Joelle, Burtsev, Mikhail, Weston, Jason

arXiv.org Artificial Intelligence

We describe the setting and results of the ConvAI2 NeurIPS competition, which aims to further the state of the art in open-domain chatbots. Some key takeaways from the competition are: (i) pretrained Transformer variants are currently the best-performing models on this task; (ii) to improve performance on multi-turn conversations with humans, future systems must go beyond single-word metrics like perplexity and measure performance across sequences of utterances (conversations) in terms of repetition, consistency, and balance of dialogue acts. The Conversational Intelligence Challenge aims at finding approaches to creating high-quality dialogue agents capable of meaningful open-domain conversation. Today, progress in the field is significantly hampered by the absence of established benchmark tasks for non-goal-oriented dialogue systems (chatbots) and of solid evaluation criteria for automatic assessment of dialogue quality. The aim of this competition was therefore to establish a concrete scenario for testing chatbots that aim to engage humans, and to become a standard evaluation tool that makes such systems directly comparable, including open-source datasets, evaluation code (both automatic evaluations and code to run the human evaluation on Mechanical Turk), model baselines, and the winning model itself. Taking into account the results of the previous edition, this year we improved the task, the evaluation process, and the human conversationalists' experience. We did this in part by making the setup simpler for the competitors, and in part by making the conversations more engaging for humans. We provided a dataset from the beginning, Persona-Chat, whose training set consists of conversations between crowdworkers who were randomly paired and asked to act the part of a given provided persona (randomly assigned, and created by another set of crowdworkers).
The paired workers were asked to chat naturally and to get to know each other during the conversation. This produces interesting and engaging conversations that learning agents can try to mimic.
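
Perplexity, the single-word metric the abstract says future systems must go beyond, is the exponential of the average negative log-likelihood the model assigns to each token. A minimal sketch of the standard definition; the probability values are made up for illustration.

```python
import math

# Perplexity from per-token model probabilities: exp of the mean negative
# log-likelihood. Lower is better; a model assigning uniform probability
# over k choices at every step has perplexity exactly k.

def perplexity(token_probs) -> float:
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

print(perplexity([0.25, 0.25, 0.25, 0.25]))  # uniform over 4 tokens -> 4.0
```

Because it scores one token at a time, perplexity cannot penalize conversation-level failures like repeating oneself across turns, which is exactly the gap the competition's human evaluation targets.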