average weight


When, Where and Why to Average Weights?

Ajroldi, Niccolò, Orvieto, Antonio, Geiping, Jonas

arXiv.org Artificial Intelligence

Averaging checkpoints along the training trajectory is a simple yet powerful approach to improve the generalization performance of Machine Learning models and reduce training time. Motivated by these potential gains, and in an effort to fairly and thoroughly benchmark this technique, we present an extensive evaluation of averaging techniques in modern Deep Learning, which we perform using AlgoPerf (Dahl et al., 2023), a large-scale benchmark for optimization algorithms. We investigate whether weight averaging can reduce training time, improve generalization, and replace learning rate decay, as suggested by recent literature. Our evaluation across seven architectures and datasets reveals that averaging significantly accelerates training and yields considerable efficiency gains, at the price of a minimal implementation and memory cost, while mildly improving generalization across all considered workloads. Finally, we explore the relationship between averaging and learning rate annealing and show how to optimally combine the two to achieve the best performance.
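The core mechanic the abstract describes, averaging checkpoints saved along the training trajectory, can be sketched in a few lines. This is a generic illustration, not the paper's implementation; the `average_checkpoints` function and the dict-of-lists parameter format are assumptions for the example.

```python
# Minimal sketch of checkpoint weight averaging: given several parameter
# snapshots saved during training, take their uniform element-wise mean.
# Each checkpoint is represented here as a dict mapping parameter names
# to flat lists of floats (a stand-in for real tensors).

def average_checkpoints(checkpoints):
    """Uniformly average a list of parameter dicts (name -> list of floats)."""
    n = len(checkpoints)
    names = checkpoints[0].keys()
    return {
        name: [
            sum(ckpt[name][i] for ckpt in checkpoints) / n
            for i in range(len(checkpoints[0][name]))
        ]
        for name in names
    }

# Two toy checkpoints from different points of the trajectory.
ckpts = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [2.0]},
]
avg = average_checkpoints(ckpts)
# avg["w"] == [2.0, 3.0]; avg["b"] == [1.0]
```

In practice, variants differ mainly in which checkpoints enter the average and with what weights (e.g. an exponential moving average rather than the uniform mean shown here); the memory cost is one extra copy of the parameters, which matches the "minimal implementation and memory cost" the abstract mentions.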


Local Descriptors Weighted Adaptive Threshold Filtering For Few-Shot Learning

Yan, Bingchen

arXiv.org Artificial Intelligence

Few-shot image classification is a challenging task in the field of machine learning, involving the identification of new categories using a limited number of labeled samples. In recent years, methods based on local descriptors have made significant progress in this area. However, the key to improving classification accuracy lies in effectively filtering background noise and accurately selecting critical local descriptors highly relevant to image category information. To address this challenge, we propose an innovative weighted adaptive threshold filtering (WATF) strategy for local descriptors. This strategy can dynamically adjust based on the current task and image context, thereby selecting local descriptors most relevant to the image category. This enables the model to better focus on category-related information while effectively mitigating interference from irrelevant background regions. To evaluate the effectiveness of our method, we adopted the N-way K-shot experimental framework. Experimental results show that our method not only improves the clustering effect of selected local descriptors but also significantly enhances the discriminative ability between image categories. Notably, our method maintains a simple and lightweight design philosophy without introducing additional learnable parameters. This feature ensures consistency in filtering capability during both training and testing phases, further enhancing the reliability and practicality of the method.
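The filtering idea described above can be illustrated with a generic sketch: score each local descriptor against a task-dependent reference and keep only those clearing a threshold that adapts to the current image's score distribution. The scoring rule (cosine similarity) and the mean-plus-scaled-std threshold below are illustrative assumptions, not the paper's exact WATF formulation; `filter_descriptors` and its arguments are hypothetical names.

```python
import numpy as np

# Sketch of adaptive threshold filtering for local descriptors:
# score descriptors against a reference direction (e.g. a class
# prototype), then keep those above a per-image adaptive cut-off.

def filter_descriptors(descriptors, reference, k=0.5):
    """Keep descriptors whose relevance score clears an adaptive threshold."""
    d = descriptors / np.linalg.norm(descriptors, axis=1, keepdims=True)
    r = reference / np.linalg.norm(reference)
    scores = d @ r                                # cosine similarity per descriptor
    threshold = scores.mean() + k * scores.std()  # adapts to this image's scores
    keep = scores >= threshold
    return descriptors[keep], scores[keep]

# Toy example: four 2-D descriptors, reference pointing along the x-axis.
descs = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.9, 0.1]])
reference = np.array([1.0, 0.0])  # e.g. mean descriptor of the support class
kept, kept_scores = filter_descriptors(descs, reference)
```

Note that, consistent with the abstract's design philosophy, a scheme like this introduces no learnable parameters: the threshold is computed from the scores themselves, so it behaves identically at training and test time.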


The impact of spatio-temporal travel distance on epidemics using an interpretable attention-based sequence-to-sequence model

Jiang, Yukang, Tian, Ting, Xie, Huajun, Guo, Hailiang, Wang, Xueqin

arXiv.org Artificial Intelligence

Amidst the COVID-19 pandemic, travel restrictions have emerged as crucial interventions for mitigating the spread of the virus. In this study, we enhance the predictive capabilities of our model, Sequence-to-Sequence Epidemic Attention Network (S2SEA-Net), by incorporating an attention module, allowing us to assess the impact of distinct classes of travel distances on epidemic dynamics. Furthermore, our model provides forecasts for new confirmed cases and deaths. To achieve this, we leverage daily data on population movement across various travel distance categories, coupled with county-level epidemic data in the United States. Our findings illuminate a compelling relationship between the volume of travelers at different distance ranges and the trajectories of COVID-19. Notably, a discernible spatial pattern emerges with respect to these travel distance categories on a national scale. We unveil the geographical variations in the influence of population movement at different travel distances on the dynamics of epidemic spread. This will contribute to the formulation of strategies for future epidemic prevention and public health policies.


2020 U.S. presidential election in swing states: Gender differences in Twitter conversations

Karami, Amir, Clark, Spring B., Mackenzie, Anderson, Lee, Dorathea, Zhu, Michael, Boyajieff, Hannah R., Goldschmidt, Bailey

arXiv.org Artificial Intelligence

Social media is commonly used by the public during election campaigns to express their opinions regarding different issues. Among various social media channels, Twitter provides an efficient platform for researchers and politicians to explore public opinion regarding a wide range of topics such as the economy and foreign policy. Current literature mainly focuses on analyzing the content of tweets without considering the gender of users. This research collects and analyzes a large number of tweets and uses computational, human coding, and statistical analyses to identify topics in more than 300,000 tweets posted during the 2020 U.S. presidential election and to compare female and male users regarding the average weight of the discussed topics. Our findings cover a wide range of topics, such as tax, climate change, and the COVID-19 pandemic. For more than 70% of the identified topics, there is a significant difference between female and male users.


Hypotheses Testing with SciPy

#artificialintelligence

With all the hype surrounding data science, most of us jump directly into machine learning models and algorithms to make business decisions. Many online courses, however, fail to teach the very basics of decision making. Hypothesis testing is one of the oldest and most fundamental building blocks of decision making: its earliest recorded use was in the 1700s, when John Arbuthnot tested whether male and female births are equally likely to occur. In this article, we will discuss hypothesis testing at the beginner level, along with Python code using the SciPy package.
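The kind of test the article describes can be shown in a few lines of SciPy. The example below revisits Arbuthnot's question with an exact binomial test; the birth counts are invented for illustration, not Arbuthnot's historical data.

```python
# Beginner-level hypothesis test with SciPy, in the spirit of Arbuthnot's
# question: are male and female births equally likely (H0: p = 0.5)?
# The counts below are made up for illustration.
from scipy.stats import binomtest

male_births = 5_250
total_births = 10_000

result = binomtest(male_births, n=total_births, p=0.5, alternative="two-sided")
print(f"p-value = {result.pvalue:.2e}")

if result.pvalue < 0.05:
    print("Reject H0: male and female births do not appear equally likely")
else:
    print("Fail to reject H0")
```

With 5,250 male births out of 10,000, the observed proportion sits five standard errors above 0.5, so the p-value is tiny and the null hypothesis is rejected at the usual 5% level.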


Statistical Tests in Machine Learning

#artificialintelligence

When it comes to statistics in machine learning, a common approach to accepting or rejecting a null hypothesis is to check the p-value and report a result without any real idea of what goes on in the background. Without getting into fancy jargon or mathematical technicalities, this article attempts to sum up the intuition behind statistics using some real-life examples, especially for people from a non-statistics background. Why do we need hypothesis testing? Suppose Krispy Kreme claims that Dunkin's donuts weigh less than Dunkin' advertises, and the dispute threatens to shut Dunkin' down. How do we choose sides?
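The donut dispute above is a natural fit for a two-sample test: weigh a sample of donuts from each brand and test whether the mean weights differ. This sketch uses SciPy's Welch t-test; the weights are invented for illustration.

```python
# Two-sample t-test for the donut-weight dispute: do the mean weights
# of the two brands' donuts actually differ? Sample weights (grams)
# are fabricated for this example.
from scipy.stats import ttest_ind

dunkin = [52.1, 51.8, 52.4, 51.9, 52.0, 52.3]
krispy = [49.9, 50.2, 50.1, 49.8, 50.0, 50.3]

# equal_var=False gives Welch's t-test, which does not assume the two
# samples share the same variance.
stat, pvalue = ttest_ind(dunkin, krispy, equal_var=False)
print(f"t = {stat:.2f}, p = {pvalue:.2e}")
```

A small p-value here says the observed difference in mean weight is very unlikely under the null hypothesis that both brands' donuts weigh the same on average, which is exactly the kind of evidence needed to "choose sides" rather than trusting either claim.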


Identifying Animal Growth Using Artificial Intelligence – AI.Business

#artificialintelligence

The use of artificial intelligence has been of enormous economic benefit to dairy farmers in many countries through the improvement of their stock. Affordable tools that can continuously monitor the growth rate of livestock are highly sought after by the livestock industries. This demand is driven by the potential for these tools to improve animal welfare and production efficiency. As the global population grows rapidly, the demand for meat is escalating, especially in Asia, where the middle class is currently expanding. Meanwhile, in the western world there is growing consumer concern surrounding animal husbandry, with certain organisations labelling some current husbandry practices cruel or sub-standard.