Iceberg: Enhancing HLS Modeling with Synthetic Data

Ding, Zijian, Nguyen, Tung, Li, Weikai, Grover, Aditya, Sun, Yizhou, Cong, Jason

arXiv.org Artificial Intelligence

Deep learning-based prediction models for High-Level Synthesis (HLS) of hardware designs often struggle to generalize. In this paper, we study how to close the generalizability gap of these models through pretraining on synthetic data and introduce Iceberg, a synthetic data augmentation approach that expands both large language model (LLM)-generated programs and weak labels of unseen design configurations. Our weak label generation method is integrated with an in-context model architecture, enabling meta-learning from actual and proximate labels. Iceberg improves the geometric mean modeling accuracy by $86.4\%$ when adapting to six real-world applications with few-shot examples and achieves a $2.47\times$ and a $1.12\times$ better offline DSE performance when adapting to two different test datasets. Our open-source code is available at https://github.com/UCLA-VAST/iceberg


Domain Adaptive Skin Lesion Classification via Conformal Ensemble of Vision Transformers

Zoravar, Mehran, Alijani, Shadi, Najjaran, Homayoun

arXiv.org Artificial Intelligence

Exploring the trustworthiness of deep learning models is crucial, especially in critical domains such as medical imaging decision support systems. Conformal prediction has emerged as a rigorous means of providing deep learning models with reliable uncertainty estimates and safety guarantees. However, conformal prediction results face challenges due to the backbone model's struggles in domain-shifted scenarios, such as variations across data sources. To address this challenge, this paper proposes a novel framework termed Conformal Ensemble of Vision Transformers (CE-ViTs) designed to enhance image classification performance by prioritizing domain adaptation and model robustness, while accounting for uncertainty. The proposed method leverages an ensemble of vision transformer models in the backbone, trained on diverse datasets including the HAM10000, Dermofit, and Skin Cancer ISIC datasets. This ensemble learning approach, calibrated on the combined datasets, aims to enhance domain adaptation through conformal learning. Experimental results underscore that the framework achieves a high coverage rate of 90.38\%, an improvement of 9.95\% over the HAM10000 model, indicating a stronger likelihood that the prediction set includes the true label compared to singular models. Ensemble learning in CE-ViTs significantly improves conformal prediction performance, increasing the average prediction set size for challenging misclassified samples from 1.86 to 3.075.
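As a rough sketch of the split conformal procedure underlying frameworks like this one (not the paper's exact CE-ViTs pipeline; the calibration scores and the miscoverage level `alpha` below are illustrative toy values):

```python
import math

def conformal_threshold(cal_scores, alpha=0.1):
    # Split conformal: take the ceil((n+1)(1-alpha))-th smallest
    # calibration nonconformity score as the threshold.
    n = len(cal_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    # If k exceeds n, the prediction set must contain every class;
    # clamping to the largest score is a simple sketch-level fallback.
    return sorted(cal_scores)[min(k, n) - 1]

def prediction_set(probs, threshold):
    # Include each class whose nonconformity (1 - softmax probability)
    # does not exceed the calibrated threshold.
    return [c for c, p in enumerate(probs) if 1 - p <= threshold]

# Toy calibration scores: 1 - model probability of the true class.
cal = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80]
t = conformal_threshold(cal, alpha=0.2)
covered = prediction_set([0.6, 0.3, 0.1], t)
```

The coverage rate reported in the abstract is exactly the quantity this procedure guarantees on average: the fraction of test points whose true label lands inside the prediction set.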


Arabic Tweet Act: A Weighted Ensemble Pre-Trained Transformer Model for Classifying Arabic Speech Acts on Twitter

Alshehri, Khadejaa, Alhothali, Areej, Alowidi, Nahed

arXiv.org Artificial Intelligence

Speech acts are a speaker's actions when performing an utterance within a conversation, such as asking, recommending, greeting, or thanking someone, expressing a thought, or making a suggestion. Understanding speech acts helps interpret the intended meaning and actions behind a speaker's or writer's words. This paper proposes a Twitter dialectal Arabic speech act classification approach based on a transformer deep learning neural network. Twitter and social media are becoming more and more integrated into daily life. As a result, they have evolved into a vital source of information that represents the views and attitudes of their users. We propose a BERT-based weighted ensemble learning approach to integrate the advantages of various BERT models in dialectal Arabic speech act classification. We compared the proposed model against several variants of Arabic BERT models and sequence-based models. We developed a dialectal Arabic tweet act dataset by annotating a subset of a large existing Arabic sentiment analysis dataset (ASAD) based on six speech act categories. We also evaluated the models on a previously developed Arabic Tweet Act dataset (ArSAS). To overcome the class imbalance issue commonly observed in speech act problems, a transformer-based data augmentation model was implemented to generate an equal proportion of speech act categories. The results show that the best BERT model is araBERTv2-Twitter, with a macro-averaged F1 score and an accuracy of 0.73 and 0.84, respectively. The performance improved using a BERT-based ensemble method, reaching an averaged F1 score and accuracy of 0.74 and 0.85 on our dataset, respectively.
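A minimal sketch of how a weighted ensemble of classifiers can be combined (the per-model probabilities and weights below are made up for illustration; the paper's actual weighting scheme may differ):

```python
def weighted_ensemble(prob_lists, weights):
    # Weighted average of per-model class-probability vectors,
    # normalized by the total weight so the result still sums to 1.
    total = sum(weights)
    n_classes = len(prob_lists[0])
    return [
        sum(w * probs[c] for probs, w in zip(prob_lists, weights)) / total
        for c in range(n_classes)
    ]

# Two hypothetical BERT variants scoring six speech-act classes.
m1 = [0.50, 0.20, 0.10, 0.10, 0.05, 0.05]
m2 = [0.30, 0.40, 0.10, 0.10, 0.05, 0.05]
avg = weighted_ensemble([m1, m2], weights=[0.6, 0.4])
pred = max(range(len(avg)), key=avg.__getitem__)  # argmax class
```

The ensemble's prediction is simply the argmax of the blended probability vector, which is why a well-chosen weighting can outperform any single member model.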


Efficacy of Machine-Generated Instructions

Gulati, Samaksh, Verma, Anshit, Parmar, Manoj, Chaudhary, Palash

arXiv.org Artificial Intelligence

Large "instruction-tuned" language models (i.e., finetuned to respond to instructions) have demonstrated a remarkable ability to generalize zero-shot to new tasks. Nevertheless, they depend heavily on human-written instruction data that is often limited in quantity, diversity, and creativity, therefore hindering the generality of the tuned model. We conducted a quantitative study to evaluate the efficacy of machine-generated annotations, comparing the results of a fine-tuned BERT model trained with human- versus machine-generated annotations. Applying our methods to the vanilla GPT-3 model, we saw that machine-generated annotations were 78.54% correct and the fine-tuned model achieved a score of 96.01%. This result shows that machine-generated annotations are a resource- and cost-effective way to fine-tune downstream models.


Confidence Is All You Need for MI Attacks

Sinha, Abhishek, Tibrewal, Himanshi, Gupta, Mansi, Waghela, Nikhar, Garg, Shivank

arXiv.org Artificial Intelligence

In this evolving era of machine learning security, membership inference attacks have emerged as a potent threat to the confidentiality of sensitive data. In this attack, adversaries aim to determine whether a particular point was used during the training of a target model. This paper proposes a new method to gauge a data point's membership in a model's training set. Instead of correlating loss with membership, as is traditionally done, we leverage the fact that training examples generally exhibit higher confidence values when classified into their actual class. During training, the model is essentially being 'fit' to the training data and might face particular difficulties in generalizing to unseen data. This asymmetry leads to the model achieving higher confidence on the training data as it exploits the specific patterns and noise present in it. Our proposed approach leverages the confidence values generated by the machine learning model. These confidence values provide a probabilistic measure of the model's certainty in its predictions and can further be used to infer the membership of a given data point. Additionally, we introduce another variant of our method that allows us to carry out this attack without knowing the ground truth (true class) of a given data point, thus offering an edge over existing label-dependent attack methods.
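The label-free variant described above can be sketched as a simple threshold rule on the model's top-class confidence (the scores and the 0.9 threshold below are illustrative, not values from the paper):

```python
def confidence_mi_attack(confidences, threshold=0.9):
    # Predict "member" when the model's top-class confidence exceeds
    # the threshold: training points tend to receive higher confidence
    # than unseen points, and no ground-truth label is required.
    return [conf >= threshold for conf in confidences]

# Toy top-class confidences: first two from training data, last two held out.
scores = [0.98, 0.95, 0.70, 0.55]
guesses = confidence_mi_attack(scores, threshold=0.9)
```

In practice the threshold would be calibrated on shadow models or held-out data; the point of the sketch is only that the decision rule needs the confidence value alone, not the true class.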


An easy to follow guide to fine-tune your first HuggingFace model

#artificialintelligence

We have consequently created a summarizer model, and in a similar fashion can create models for Text Classification, Token Classification, Question Answering, Language Modelling, Translation, etc. I urge you to follow the transformers documentation to try out more tasks and other models as well.


Explainable Patterns for Distinction and Prediction of Moral Judgement on Reddit

Efstathiadis, Ion Stagkos, Paulino-Passos, Guilherme, Toni, Francesca

arXiv.org Artificial Intelligence

The forum r/AmITheAsshole on Reddit hosts discussion on moral issues based on concrete narratives presented by users. Existing analysis of the forum focuses on its comments and does not make the underlying data publicly available. In this paper we build a new dataset of comments and also investigate the classification of the posts in the forum. Further, we identify textual patterns associated with the provocation of moral judgement by posts, with the expression of moral stance in comments, and with the decisions of trained classifiers of posts and comments.


Introduction to PyTorch

#artificialintelligence

Recently, Microsoft and PyTorch announced a "PyTorch Fundamentals" tutorial, which you can find on Microsoft's site and on PyTorch's site. The code in this post is based on the code appearing in that tutorial, and forms the foundation for a series of other posts, where I'll explore other machine learning frameworks and show integration with Azure ML. In this post, I'll explain how you can create a basic neural network in PyTorch, using the Fashion MNIST dataset as a data source. The neural network we'll build takes as input images of clothing, and classifies them according to their contents, such as "Shirt," "Coat," or "Dress." I'll assume that you have a basic conceptual understanding of neural networks, and that you're comfortable with Python, but I assume no knowledge of PyTorch. Let's start by getting familiar with the data we'll be using, the Fashion MNIST dataset. This dataset contains 70,000 grayscale images of articles of clothing -- 60,000 meant to be used for training and 10,000 meant for testing.
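A minimal version of the kind of network described above might look like this (this is a generic sketch, not the tutorial's exact code; the layer sizes are one common choice for 28x28 Fashion MNIST images and ten clothing classes):

```python
import torch
from torch import nn

# A small fully connected classifier for 28x28 grayscale images.
model = nn.Sequential(
    nn.Flatten(),              # (N, 1, 28, 28) -> (N, 784)
    nn.Linear(28 * 28, 128),   # hidden layer
    nn.ReLU(),
    nn.Linear(128, 10),        # one logit per clothing class
)

batch = torch.randn(4, 1, 28, 28)   # four fake grayscale images
logits = model(batch)               # shape: (4, 10)
```

From here, training follows the usual PyTorch loop: compute a loss such as `nn.CrossEntropyLoss` on the logits, call `backward()`, and step an optimizer.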


Loss Functions

#artificialintelligence

Today is a new day, a day of adventure and mountain climbing! So like the good student you are, you attended today's class but didn't understand:( Luckily, you got me, your personal professor. I asked your classmates about today's class and they told me that the professor taught you about Loss Functions, some even told me that he taught them how to climb down from different mountains. Well, grab your hiking gear and follow my lead, we are going to climb down from a high mountain, higher than Everest itself. Do you remember that the objective of training the neural network is to try to minimize the loss between the predictions and the actual values?
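That objective, minimizing the gap between predictions and actual values, can be made concrete with one of the simplest loss functions, mean squared error (the numbers below are toy values for illustration):

```python
def mse_loss(predictions, targets):
    # Mean squared error: the average squared gap between each
    # prediction and the corresponding actual value.
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(predictions)

loss = mse_loss([2.5, 0.0, 2.0], [3.0, -0.5, 2.0])
```

The "mountain" in the metaphor is this loss surface: training walks downhill on it by nudging the model's parameters in the direction that shrinks the loss.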

