Analyzing and Mitigating Negation Artifacts using Data Augmentation for Improving ELECTRA-Small Model Accuracy

Noghabaei, Mojtaba

arXiv.org Artificial Intelligence

Pre-trained models for natural language inference (NLI) often achieve high performance on benchmark datasets by exploiting spurious correlations, or dataset artifacts, rather than understanding language nuances such as negation. In this project, we investigate the performance of an ELECTRA-small model fine-tuned on the Stanford Natural Language Inference (SNLI) dataset, focusing on its handling of negation. Through analysis, we identify that the model struggles to correctly classify examples containing negation. To address this, we augment the training data with contrast sets and adversarial examples emphasizing negation. Our results demonstrate that this targeted data augmentation improves the model's accuracy on negation-containing examples without adversely affecting overall performance, thereby mitigating the identified dataset artifact.
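The contrast-set augmentation described in this abstract can be sketched with a simple rule-based generator. The `negate` helper, the label-flipping rule (inserting "not" flips entailment to contradiction), and the example pair are all hypothetical illustrations, not the authors' code:

```python
# Sketch of negation-focused contrast-set augmentation for NLI.
# Assumption: negating the hypothesis flips entailment <-> contradiction
# and leaves neutral unchanged; real contrast sets are human-verified.

def negate(hypothesis: str) -> str:
    """Insert a simple negation after the first copula/auxiliary verb."""
    for aux in (" is ", " are ", " was ", " were "):
        if aux in hypothesis:
            return hypothesis.replace(aux, aux.rstrip() + " not ", 1)
    return "It is not the case that " + hypothesis[0].lower() + hypothesis[1:]

def augment(example: dict) -> list:
    """Return the original example plus a negated contrast example."""
    flip = {"entailment": "contradiction",
            "contradiction": "entailment",
            "neutral": "neutral"}
    contrast = {
        "premise": example["premise"],
        "hypothesis": negate(example["hypothesis"]),
        "label": flip[example["label"]],
    }
    return [example, contrast]

pair = {"premise": "A man is playing a guitar.",
        "hypothesis": "A man is playing an instrument.",
        "label": "entailment"}
augmented = augment(pair)
# augmented[1] is the negated contrast example with a flipped label.
```

In practice such automatically generated pairs would be filtered or reviewed, since naive negation insertion can produce ungrammatical or label-ambiguous hypotheses.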


Text-Based Approaches to Item Alignment to Content Standards in Large-Scale Reading & Writing Tests

Fu, Yanbin, Jiao, Hong, Zhou, Tianyi, Zhang, Nan, Li, Ming, Xu, Qingshu, Peters, Sydney, Lissitz, Robert W.

arXiv.org Artificial Intelligence

University of Maryland, College Park. Aligning test items to content standards is a critical step in test development to collect validity evidence based on content. Item alignment has typically been conducted by human experts; this judgmental process can be subjective and time-consuming. This study investigated the performance of fine-tuned small language models (SLMs) for automated item alignment using data from a large-scale standardized reading and writing test for college admissions. Different SLMs were trained for alignment at both the domain and skill levels, with 10 skills mapped to 4 content domains. Model performance was evaluated on multiple criteria using two testing datasets, and the impact of the type and size of the input data used for training was investigated. Results showed that including more item text data led to substantially better model performance, surpassing the improvements induced by sample-size increase alone. For comparison, supervised machine learning models were trained using embeddings from the multilingual-E5-large-instruct model. The results showed that the fine-tuned SLMs consistently outperformed the embedding-based supervised machine learning models, particularly for the more fine-grained skill alignment. To better understand model misclassifications, multiple semantic similarity analyses were conducted, including pairwise cosine similarity, Kullback-Leibler divergence of embedding distributions, and two-dimensional projections of item embeddings.
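The similarity diagnostics named in this abstract (pairwise cosine similarity and KL divergence over embedding distributions) can be sketched in a few lines. The toy vectors below are illustrative stand-ins for item embeddings, not the study's data, and converting embeddings to distributions via softmax is an assumption for the sketch:

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def softmax(x):
    """Turn a raw vector into a probability distribution."""
    m = max(x)
    exps = [math.exp(a - m) for a in x]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) between two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

item_a = [0.9, 0.1, 0.3]   # hypothetical embedding of a misclassified item
item_b = [0.8, 0.2, 0.4]   # hypothetical embedding of its predicted skill
sim = cosine_similarity(item_a, item_b)
kl = kl_divergence(softmax(item_a), softmax(item_b))
```

High cosine similarity paired with nonzero KL divergence is exactly the situation where misclassifications are plausible: the items are close in direction yet distributed differently across dimensions.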



To Reviewer #1, C1: Misleading comparisons to ELECTRA in RTE, STS-B and MRPC

Neural Information Processing Systems

We thank all reviewers for the valuable comments and suggestions. Please find responses (R) to specific comments (C). We use MNLI initialization to make a fair comparison with ELECTRA. The results are shown in Table 1. We will add this comparison to our paper in the next version.


Advancing Hate Speech Detection with Transformers: Insights from the MetaHate

Chapagain, Santosh, Hamdi, Shah Muhammad, Boubrahimi, Soukaina Filali

arXiv.org Artificial Intelligence

Hate speech is a widespread and harmful form of online discourse, encompassing slurs and defamatory posts that can have serious social, psychological, and sometimes physical impacts on targeted individuals and communities. As social media platforms such as X (formerly Twitter), Facebook, Instagram, Reddit, and others continue to facilitate widespread communication, they also become breeding grounds for hate speech, which has increasingly been linked to real-world hate crimes. Addressing this issue requires the development of robust automated methods to detect hate speech in diverse social media environments. Deep learning approaches, such as vanilla recurrent neural networks (RNNs), long short-term memory (LSTM), and convolutional neural networks (CNNs), have achieved good results, but are often limited by issues such as long-term dependencies and inefficient parallelization. This study presents a comprehensive exploration of transformer-based models for hate speech detection using the MetaHate dataset--a meta-collection of 36 datasets with 1.2 million social media samples. We evaluate multiple state-of-the-art transformer models, including BERT, RoBERTa, GPT-2, and ELECTRA, with fine-tuned ELECTRA achieving the highest performance (F1 score: 0.8980). We also analyze classification errors, revealing challenges with sarcasm, coded language, and label noise.


ELECTRA: A Symmetry-breaking Cartesian Network for Charge Density Prediction with Floating Orbitals

Elsborg, Jonas, Thiede, Luca, Aspuru-Guzik, Alán, Vegge, Tejs, Bhowmik, Arghya

arXiv.org Artificial Intelligence

We present the Electronic Tensor Reconstruction Algorithm (ELECTRA) - an equivariant model for predicting electronic charge densities using "floating" orbitals. Floating orbitals are a long-standing idea in the quantum chemistry community that promises more compact and accurate representations by placing orbitals freely in space, as opposed to centering all orbitals at the positions of atoms. Finding ideal placements of these orbitals, however, requires extensive domain knowledge, which has thus far prevented widespread adoption. We solve this in a data-driven manner by training a Cartesian tensor network to predict orbital positions along with orbital coefficients. This is made possible through a symmetry-breaking mechanism that is used to learn position displacements with lower symmetry than the input molecule while preserving the rotation equivariance of the charge density itself. Inspired by recent successes of Gaussian Splatting in representing densities in space, we use Gaussians as our orbitals and predict their weights and covariance matrices. Our method achieves a state-of-the-art balance between computational efficiency and predictive accuracy on established benchmarks.


Exploring the Panorama of Anxiety Levels: A Multi-Scenario Study Based on Human-Centric Anxiety Level Detection and Personalized Guidance

Xian, Longdi, Xu, Junhao

arXiv.org Artificial Intelligence

Faculty of Computer Science and Information Technology, University of Malaya, Malaysia. More and more people are under pressure from work, life, and education. Under these pressures, people can develop an anxious state of mind, or even initial symptoms of suicidality. With the advancement of artificial intelligence technology, large language modeling is currently one of the hottest technologies. It is often used for detecting psychological disorders; however, existing studies only give the categorization result, without an interpretable description of what led to it. Building on these immature studies, this study adopts a person-centered perspective and focuses on GPT-generated multi-scenario simulated conversations, which were selected as data samples for the study. Various transformer-based encoder models were utilized in order to build a classification model capable of identifying different anxiety levels. In addition, a knowledge base focusing on anxiety was constructed in this study using Langchain and GPT-4. When analyzing the classification results, this knowledge base was able to provide explanations and reasons most relevant to the interlocutor's anxiety situation. The study shows that the developed model achieves more than 94% accuracy in categorical prediction and that the advice provided is highly personalized. Mental health is defined as a state of well-being on the mental, emotional, and social levels [8, 16, 34]. Abnormal anxiety is a very important factor that undermines mental health [3, 19, 43].


ELECTRA and GPT-4o: Cost-Effective Partners for Sentiment Analysis

Beno, James P.

arXiv.org Artificial Intelligence

Bidirectional transformers excel at sentiment analysis, and Large Language Models (LLMs) are effective zero-shot learners. Might they perform better as a team? This paper explores collaborative approaches between ELECTRA and GPT-4o for three-way sentiment classification. We fine-tuned (FT) four models (ELECTRA Base/Large, GPT-4o/4o-mini) using a mix of reviews from the Stanford Sentiment Treebank (SST) and DynaSent. We provided input from ELECTRA to GPT as: predicted label, probabilities, and retrieved examples. Sharing ELECTRA Base FT predictions with GPT-4o-mini significantly improved performance over either model alone (82.74 macro F1 vs. 79.29 for ELECTRA Base FT and 79.52 for GPT-4o-mini) and yielded the lowest cost/performance ratio ($0.12/F1 point). However, when GPT models were fine-tuned, including predictions decreased performance. GPT-4o FT-M was the top performer (86.99), with GPT-4o-mini FT close behind (86.77) at much lower cost ($0.38 vs. $1.59/F1 point). Our results show that augmenting prompts with predictions from fine-tuned encoders is an efficient way to boost performance, and that a fine-tuned GPT-4o-mini is nearly as good as GPT-4o FT at 76% less cost. Both are affordable options for projects with limited resources.
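The prompt-augmentation strategy this abstract describes (passing the encoder's predicted label and class probabilities into the LLM prompt) amounts to plain prompt construction. The wording, field names, and example probabilities below are hypothetical, not the paper's actual prompt template:

```python
def build_prompt(text: str, electra_label: str, electra_probs: dict) -> str:
    """Compose an LLM prompt augmented with a fine-tuned encoder's prediction.

    electra_probs maps class name -> probability (assumed three-way scheme:
    negative / neutral / positive).
    """
    probs = ", ".join(f"{label}: {p:.2f}" for label, p in electra_probs.items())
    return (
        "Classify the sentiment of the review as negative, neutral, or positive.\n"
        f"Review: {text}\n"
        f"A fine-tuned ELECTRA classifier predicted '{electra_label}' "
        f"with probabilities ({probs}).\n"
        "Answer with a single label."
    )

prompt = build_prompt(
    "The plot dragged, but the acting was superb.",
    "neutral",
    {"negative": 0.21, "neutral": 0.48, "positive": 0.31},
)
```

Exposing the full probability vector, not just the argmax label, lets the LLM weigh how confident the encoder was, which matters most on borderline reviews like the one above.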