electra
Text-Based Approaches to Item Alignment to Content Standards in Large-Scale Reading & Writing Tests
Fu, Yanbin, Jiao, Hong, Zhou, Tianyi, Zhang, Nan, Li, Ming, Xu, Qingshu, Peters, Sydney, Lissitz, Robert W.
Yanbin Fu, Hong Jiao, Tianyi Zhou, Nan Zhang, Ming Li, Qingshu Xu, Sydney Peters, Robert W. Lissitz. University of Maryland, College Park. Abstract: Aligning test items to content standards is a critical step in test development to collect validity evidence based on content. Item alignment has typically been conducted by human experts. This judgmental process can be subjective and time-consuming. This study investigated the performance of fine-tuned small language models (SLMs) for automated item alignment using data from a large-scale standardized reading and writing test for college admissions. Different SLMs were trained for alignment at both the domain and skill levels, with 10 skills mapped to 4 content domains. Model performance was evaluated on multiple criteria on two testing datasets. The impact of the types and sizes of the training input data was also investigated. Results showed that including more item text data led to substantially better model performance, surpassing the improvements induced by sample size increase alone. For comparison, supervised machine learning models were trained using the embeddings from the multilingual-E5-large-instruct model. The study results showed that fine-tuned SLMs consistently outperformed the embedding-based supervised machine learning models, particularly for the more fine-grained skill alignment. To better understand model misclassifications, multiple semantic similarity analyses, including pairwise cosine similarity, Kullback-Leibler divergence of embedding distributions, and two-dimensional projections of item embeddings, were conducted.
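The two similarity measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how embedding distributions were formed for the Kullback-Leibler divergence, so turning each vector into a discrete distribution with a softmax is purely an assumption made here for illustration, and the toy vectors are invented rather than taken from the study.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def softmax(x):
    """Map a real vector to a strictly positive distribution (assumed step)."""
    m = max(x)
    exps = [math.exp(a - m) for a in x]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; q must be strictly positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy 4-dimensional "item embeddings" (made-up numbers, not study data).
item_a = [0.2, 0.1, 0.9, 0.4]
item_b = [0.25, 0.05, 0.8, 0.5]   # close to item_a
item_c = [0.9, 0.7, 0.1, 0.0]     # far from item_a

sim_ab = cosine_similarity(item_a, item_b)
sim_ac = cosine_similarity(item_a, item_c)
kl_ab = kl_divergence(softmax(item_a), softmax(item_b))
kl_ac = kl_divergence(softmax(item_a), softmax(item_c))
```

In this toy setup, items whose embeddings point in similar directions have high cosine similarity and low divergence, which is the kind of overlap that could make two skills hard to tell apart.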
Advancing Hate Speech Detection with Transformers: Insights from the MetaHate
Chapagain, Santosh, Hamdi, Shah Muhammad, Boubrahimi, Soukaina Filali
Hate speech is a widespread and harmful form of online discourse, encompassing slurs and defamatory posts that can have serious social, psychological, and sometimes physical impacts on targeted individuals and communities. As social media platforms such as X (formerly Twitter), Facebook, Instagram, Reddit, and others continue to facilitate widespread communication, they also become breeding grounds for hate speech, which has increasingly been linked to real-world hate crimes. Addressing this issue requires the development of robust automated methods to detect hate speech in diverse social media environments. Deep learning approaches, such as vanilla recurrent neural networks (RNNs), long short-term memory (LSTM), and convolutional neural networks (CNNs), have achieved good results, but are often limited by issues such as long-term dependencies and inefficient parallelization. This study presents a comprehensive exploration of transformer-based models for hate speech detection using the MetaHate dataset, a meta-collection of 36 datasets with 1.2 million social media samples. We evaluate multiple state-of-the-art transformer models, including BERT, RoBERTa, GPT-2, and ELECTRA, with fine-tuned ELECTRA achieving the highest performance (F1 score: 0.8980). We also analyze classification errors, revealing challenges with sarcasm, coded language, and label noise.
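For readers unfamiliar with the reported metric, the following is a small Python sketch of how an F1 score is computed from predictions. The abstract does not state whether the 0.8980 figure is a binary, macro, or weighted F1, so the binary form below is an assumption, and the labels are invented for illustration.

```python
def binary_f1(y_true, y_pred, positive=1):
    """F1 for one positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero/undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy labels: 1 = hate speech, 0 = not hate speech (made-up data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
score = binary_f1(y_true, y_pred)  # 3 TP, 1 FP, 1 FN
```

With three true positives, one false positive, and one false negative, precision and recall are both 0.75, so the F1 score is also 0.75.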
ELECTRA: A Symmetry-breaking Cartesian Network for Charge Density Prediction with Floating Orbitals
Elsborg, Jonas, Thiede, Luca, Aspuru-Guzik, Alán, Vegge, Tejs, Bhowmik, Arghya
We present the Electronic Tensor Reconstruction Algorithm (ELECTRA), an equivariant model for predicting electronic charge densities using "floating" orbitals. Floating orbitals are a long-standing idea in the quantum chemistry community that promises more compact and accurate representations by placing orbitals freely in space, as opposed to centering all orbitals at the position of atoms. Finding ideal placements of these orbitals requires extensive domain knowledge, though, which thus far has prevented widespread adoption. We solve this in a data-driven manner by training a Cartesian tensor network to predict orbital positions along with orbital coefficients. This is made possible through a symmetry-breaking mechanism that is used to learn position displacements with lower symmetry than the input molecule while preserving the rotation equivariance of the charge density itself. Inspired by recent successes of Gaussian Splatting in representing densities in space, we use Gaussians as our orbitals and predict their weights and covariance matrices. Our method achieves a state-of-the-art balance between computational efficiency and predictive accuracy on established benchmarks.
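To make the idea of a density built from freely placed Gaussians concrete, here is a minimal Python sketch that evaluates a weighted sum of 3D Gaussians at a point. It is not the paper's model: the abstract mentions full covariance matrices, while this sketch assumes diagonal covariances and normalized Gaussians for simplicity, and all centers, weights, and variances are invented.

```python
import math

def gaussian_3d(r, center, variances):
    """Normalized 3D Gaussian with a diagonal covariance (simplifying assumption)."""
    exponent = sum((x - c) ** 2 / (2.0 * s) for x, c, s in zip(r, center, variances))
    norm = math.sqrt((2.0 * math.pi) ** 3 * variances[0] * variances[1] * variances[2])
    return math.exp(-exponent) / norm

def density(r, orbitals):
    """Charge density at point r as a weighted sum of Gaussian 'orbitals'."""
    return sum(w * gaussian_3d(r, c, v) for w, c, v in orbitals)

# Each orbital: (weight, center, diagonal variances). Values are hypothetical.
orbitals = [
    (1.0, (0.0, 0.0, 0.0), (0.5, 0.5, 0.5)),  # centered on an atom at the origin
    (0.3, (0.7, 0.0, 0.0), (0.2, 0.2, 0.4)),  # "floating": displaced from any atom
]

rho_origin = density((0.0, 0.0, 0.0), orbitals)
rho_far = density((10.0, 0.0, 0.0), orbitals)
```

In the approach described above, a network would predict the centers, weights, and covariances; the sketch only shows how such predicted parameters could define a density once they are known.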
Exploring the Panorama of Anxiety Levels: A Multi-Scenario Study Based on Human-Centric Anxiety Level Detection and Personalized Guidance
Faculty of Computer Science and Information Technology, University of Malaya, Malaysia. Abstract: More and more people are under pressure from work, life and education. Under these pressures, people will develop an anxious state of mind, or even the initial symptoms of suicide. With the advancement of artificial intelligence technology, large language modeling is currently one of the hottest technologies. It is often used for detecting psychological disorders; however, current studies give only the categorization result and do not give an interpretable description of what led to that result. In light of these immature studies, this study adopts a person-centered perspective and focuses on GPT-generated multi-scenario simulated conversations. These simulated conversations were selected as data samples for the study. Various transformer-based encoder models were utilized in the study in order to integrate a classification model capable of identifying different anxiety levels. In addition, a knowledge base focusing on anxiety was constructed in this study using LangChain and GPT-4. When analyzing the classification results, this knowledge base was able to provide explanations and reasons that were most relevant to the interlocutor's anxiety situation. The study shows that the developed model achieves more than 94% accuracy in categorical prediction and that the advice provided is highly personalized. Mental health is defined as a state of well-being on the mental, emotional, and social levels [8, 16, 34]. Abnormal anxiety is a very important factor that leads to mental health problems [3, 19, 43].
ELECTRA and GPT-4o: Cost-Effective Partners for Sentiment Analysis
Bidirectional transformers excel at sentiment analysis, and Large Language Models (LLMs) are effective zero-shot learners. Might they perform better as a team? This paper explores collaborative approaches between ELECTRA and GPT-4o for three-way sentiment classification. We fine-tuned (FT) four models (ELECTRA Base/Large, GPT-4o/4o-mini) using a mix of reviews from Stanford Sentiment Treebank (SST) and DynaSent. We provided input from ELECTRA to GPT as: predicted label, probabilities, and retrieved examples. Sharing ELECTRA Base FT predictions with GPT-4o-mini significantly improved performance over either model alone (82.74 macro F1 vs. 79.29 ELECTRA Base FT, 79.52 GPT-4o-mini) and yielded the lowest cost/performance ratio ($0.12/F1 point). However, when GPT models were fine-tuned, including predictions decreased performance. GPT-4o FT-M was the top performer (86.99), with GPT-4o-mini FT close behind (86.77) at much less cost ($0.38 vs. $1.59/F1 point). Our results show that augmenting prompts with predictions from fine-tuned encoders is an efficient way to boost performance, and a fine-tuned GPT-4o-mini is nearly as good as GPT-4o FT at 76% less cost. Both are affordable options for projects with limited resources.
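The collaboration described above can be sketched in Python as a prompt that carries the encoder's predicted label and probabilities, alongside a check of the "76% less cost" figure from the two reported costs per F1 point. The prompt wording, the function names, and the label names (negative, neutral, positive) are hypothetical; the abstract does not give the actual prompt.

```python
def build_augmented_prompt(review, label, probs):
    """Compose an LLM prompt that includes an encoder's prediction (hypothetical wording)."""
    prob_text = ", ".join(f"{name}: {p:.2f}" for name, p in probs.items())
    return (
        "Classify the sentiment of the review as negative, neutral, or positive.\n"
        f"Review: {review}\n"
        f"A fine-tuned encoder predicted: {label} ({prob_text})\n"
        "Answer:"
    )

def relative_saving(cost_cheap, cost_expensive):
    """Fraction by which the cheaper option undercuts the more expensive one."""
    return (cost_expensive - cost_cheap) / cost_expensive

# Made-up review and encoder output, for illustration only.
prompt = build_augmented_prompt(
    "The food was fine, nothing special.",
    "neutral",
    {"negative": 0.10, "neutral": 0.75, "positive": 0.15},
)

# Costs per F1 point reported in the abstract: $0.38 (GPT-4o-mini FT) vs. $1.59 (GPT-4o FT).
saving = relative_saving(0.38, 1.59)  # (1.59 - 0.38) / 1.59, about 0.761
```

The saving works out to roughly 0.761, consistent with the 76% stated in the abstract.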
The Hottest Startups in Paris in 2024
In the past two years the French capital has been in the throes of AI fever and has launched some of Europe's most talked-about startups, including Mistral, which is currently valued at $6.2 billion (£4.7 billion). That's partly down to the support the industry has received. President Emmanuel Macron has given French AI startups some emphatic political backing, while telecoms billionaire Xavier Niel has provided much investment and will to finance national ambition. In September 2023, Niel invested €200 million ($212 million), splitting that money between funding for startups such as Mistral, an AI research lab called Kyutai and a cloud supercomputer powered by Nvidia. "I'm the old guy who likes entrepreneurs and the idea was always the same: how we can help this talent to stay here, creating companies," says Niel. Niel, a prolific French businessman who owns telecommunications company Iliad, believes European AI companies now have a unique opportunity to act. "If you want to create a search engine now from scratch, you cannot win because you weren't there 25 years ago.