AITopics | gemma

Collaborating Authors

gemma

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Analysing Moral Bias in Finetuned LLMs through Mechanistic Interpretability

Raimondi, Bianca, Dalbagno, Daniela, Gabbrielli, Maurizio

arXiv.org Artificial IntelligenceDec-8-2025

Large language models (LLMs) have been shown to internalize human-like biases during finetuning, yet the mechanisms by which these biases manifest remain unclear. In this work, we investigated whether the well-known Knobe effect, a moral bias in intentionality judgements, emerges in finetuned LLMs and whether it can be traced back to specific components of the model. We conducted a Layer-Patching analysis across 3 open-weights LLMs and demonstrated that the bias is not only learned during finetuning but also localized in a specific set of layers. Surprisingly, we found that patching activations from the corresponding pretrained model into just a few critical layers is sufficient to eliminate the effect. Our findings offer new evidence that social biases in LLMs can be interpreted, localized, and mitigated through targeted interventions, without the need for model retraining.

artificial intelligence, large language model, natural language, (16 more...)

arXiv.org Artificial Intelligence

2510.12229

Country:

Europe > Italy (0.14)
North America > United States (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Mind Reading or Misreading? LLMs on the Big Five Personality Test

Di Cursi, Francesco, Boldrini, Chiara, Conti, Marco, Passarella, Andrea

arXiv.org Artificial IntelligenceDec-1-2025

We evaluate large language models (LLMs) for automatic personality prediction from text under the binary Five Factor Model (BIG5). Five models -- including GPT-4 and lightweight open-source alternatives -- are tested across three heterogeneous datasets (Essays, MyPersonality, Pandora) and two prompting strategies (minimal vs. enriched with linguistic and psychological cues). Enriched prompts reduce invalid outputs and improve class balance, but also introduce a systematic bias toward predicting trait presence. Performance varies substantially: Openness and Agreeableness are relatively easier to detect, while Extraversion and Neuroticism remain challenging. Although open-source models sometimes approach GPT-4 and prior benchmarks, no configuration yields consistently reliable predictions in zero-shot binary settings. Moreover, aggregate metrics such as accuracy and macro-F1 mask significant asymmetries, with per-class recall offering clearer diagnostic value. These findings show that current out-of-the-box LLMs are not yet suitable for APPT, and that careful coordination of prompt design, trait framing, and evaluation metrics is essential for interpretable results.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2511.23101

Country: North America > United States (0.46)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Enhancing Breast Cancer Prediction with LLM-Inferred Confounders

Roy, Debmita

arXiv.org Artificial IntelligenceNov-25-2025

Wheeler High School, Marietta, GA Abstract This study enhances breast cancer prediction by using large language models to infer the likelihood of confounding diseases, namely diabetes, obesity, and cardiovascular disease, from routine clinical data. These AI-generated features improved Random Forest model performance, particularly for LLMs like Gemma (3.9%) and Llama (6.4%). The approach shows promise for noninvasive prescreening and clinical integration, supporting improved early detection and shared decision-making in breast cancer diagnosis. Introduction Breast cancer (BC) is a leading cause of death among women in the U.S., with most cases having unknown causes despite known risk factors1. Researchers have identified correlations between BC and various clinical features and biomarkers, such as body mass index, glucose, insulin, leptin, adiponectin, resistin, MCP-1, and HOMA, that can be measured through routine blood tests.

large language model, likelihood, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2511.17662

Country: North America > United States > Georgia > Cobb County > Marietta (0.25)

Genre: Research Report > Experimental Study (0.50)

Industry: Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Balancing Natural Language Processing Accuracy and Normalisation in Extracting Medical Insights

Tworek, Paulina, Bargieł, Miłosz, Khan, Yousef, Pełech-Pilichowski, Tomasz, Mikołajczyk, Marek, Lewandowski, Roman, Sousa, Jose

arXiv.org Artificial IntelligenceNov-21-2025

Extracting structured medical insights from unstructured clinical text using Natural Language Processing (NLP) remains an open challenge in healthcare, particularly in non-English contexts where resources are scarce. This study presents a comparative analysis of NLP low-compute rule-based methods and Large Language Models (LLMs) for information extraction from electronic health records (EHR) obtained from the Voivodeship Rehabilitation Hospital for Children in Ameryka, Poland. We evaluate both approaches by extracting patient demographics, clinical findings, and prescribed medications while examining the effects of lack of text normalisation and translation-induced information loss. Results demonstrate that rule-based methods provide higher accuracy in information retrieval tasks, particularly for age and sex extraction. However, LLMs offer greater adaptability and scalability, excelling in drug name recognition. The effectiveness of the LLMs was compared with texts originally in Polish and those translated into English, assessing the impact of translation. These findings highlight the trade-offs between accuracy, normalisation, and computational cost when deploying NLP in healthcare settings. We argue for hybrid approaches that combine the precision of rule-based systems with the adaptability of LLMs, offering a practical path toward more reliable and resource-efficient clinical NLP in real-world hospitals.

information, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2511.15778

Country: Europe > Poland (0.50)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Health Care Technology > Medical Record (0.87)
Health & Medicine > Health Care Providers & Services (0.69)
Health & Medicine > Diagnostic Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Unveiling Intrinsic Dimension of Texts: from Academic Abstract to Creative Story

Pedashenko, Vladislav, Kushnareva, Laida, Nibal, Yana Khassan, Tulchinskii, Eduard, Kuznetsov, Kristian, Zharchinskii, Vladislav, Maximov, Yury, Piontkovskaya, Irina

arXiv.org Artificial IntelligenceNov-20-2025

Intrinsic dimension (ID) is an important tool in modern LLM analysis, informing studies of training dynamics, scaling behavior, and dataset structure, yet its textual determinants remain underexplored. We provide the first comprehensive study grounding ID in interpretable text properties through cross-encoder analysis, linguistic features, and sparse autoencoders (SAEs). In this work, we establish three key findings. First, ID is complementary to entropy-based metrics: after controlling for length, the two are uncorrelated, with ID capturing geometric complexity orthogonal to prediction quality. Second, ID exhibits robust genre stratification: scientific prose shows low ID (~8), encyclopedic content medium ID (~9), and creative/opinion writing high ID (~10.5) across all models tested. This reveals that contemporary LLMs find scientific text "representationally simple" while fiction requires additional degrees of freedom. Third, using SAEs, we identify causal features: scientific signals (formal tone, report templates, statistics) reduce ID; humanized signals (personalization, emotion, narrative) increase it. Steering experiments confirm these effects are causal. Thus, for contemporary models, scientific writing appears comparatively "easy", whereas fiction, opinion, and affect add representational degrees of freedom. Our multi-faceted analysis provides practical guidance for the proper use of ID and the sound interpretation of ID-based results.

dimension, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2511.1521

Country:

North America > United States (0.46)
Europe > Austria (0.28)
Asia > Middle East > UAE (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Add feedback

Preference Learning from Physics-Based Feedback: Tuning Language Models to Design BCC/B2 Superalloys

Ghosh, Satanu, Holgate, Collin, Brodnik, Neal R., Downey, Doug, Daly, Samantha, Pollock, Tresa M., Carton, Samuel

arXiv.org Artificial IntelligenceNov-18-2025

We apply preference learning to the task of language model-guided design of novel structural alloys. In contrast to prior work that focuses on generating stable inorganic crystals, our approach targets the synthesizeability of a specific structural class: BCC/B2 superalloys, an underexplored family of materials with potential applications in extreme environments. Using three open-weight models (LLaMA-3.1, Gemma-2, and OLMo-2), we demonstrate that language models can be optimized for multiple design objectives using a single, unified reward signal through Direct Preference Optimization (DPO). Unlike prior approaches that rely on heuristic or human-in-the-loop feedback (costly), our reward signal is derived from thermodynamic phase calculations, offering a scientifically grounded criterion for model tuning. To our knowledge, this is the first demonstration of preference-tuning a language model using physics-grounded feedback for structural alloy design. The resulting framework is general and extensible, providing a path forward for intelligent design-space exploration across a range of physical science domains.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2511.12036

Country: North America > United States (1.00)

Genre: Research Report > New Finding (0.46)

Industry:

Energy (0.67)
Materials (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Beyond the Surface: Probing the Ideological Depth of Large Language Models

Kabir, Shariar, Esterling, Kevin, Dong, Yue

arXiv.org Artificial IntelligenceNov-17-2025

Large language models (LLMs) display recognizable political leanings, yet they vary significantly in their ability to represent a political orientation consistently. In this paper, we define ideological depth as (i) a model's ability to follow political instructions without failure (steerability), and (ii) the feature richness of its internal political representations measured with sparse autoencoders (SAEs), an unsupervised sparse dictionary learning (SDL) approach. Using Llama-3.1-8B-Instruct and Gemma-2-9B-IT as candidates, we compare prompt-based and activation-steering interventions and probe political features with publicly available SAEs. We find large, systematic differences: Gemma is more steerable in both directions and activates approximately 7.3x more distinct political features than Llama. Furthermore, causal ablations of a small targeted set of Gemma's political features to create a similar feature-poor setting induce consistent shifts in its behavior, with increased rates of refusals across topics. Together, these results indicate that refusals on benign political instructions or prompts can arise from capability deficits rather than safety guardrails. Ideological depth thus emerges as a measurable property of LLMs, and steerability serves as a window into their latent political architecture.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2508.21448

Country:

Asia (0.46)
North America > United States (0.28)

Genre: Research Report > New Finding (0.68)

Industry:

Law (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

XBreaking: Understanding how LLMs security alignment can be broken

Arazzi, Marco, Kembu, Vignesh Kumar, Nocera, Antonino, P, Vinod

arXiv.org Artificial IntelligenceNov-10-2025

Abstract--Large Language Models are fundamental actors in the modern IT landscape dominated by AI solutions. However, security threats associated with them might prevent their reliable adoption in critical application scenarios such as government organizations and medical institutions. For this reason, commercial LLMs typically undergo a sophisticated censoring mechanism to eliminate any harmful output they could possibly produce. These mechanisms maintain the integrity of LLM alignment by guaranteeing that the models respond safely and ethically. In response to this, attacks on LLMs are a significant threat to such protections, and many previous approaches have already demonstrated their effectiveness across diverse domains. Existing LLM attacks mostly adopt a generate-and-test strategy to craft malicious input. T o improve the comprehension of censoring mechanisms and design a targeted attack, we propose an Explainable-AI solution that comparatively analyzes the behavior of censored and uncensored models to derive unique exploitable alignment patterns. Then, we propose XBreaking, a novel approach that exploits these unique patterns to break the security and alignment constraints of LLMs by targeted noise injection. Our thorough experimental campaign returns important insights about the censoring mechanisms and demonstrates the effectiveness and performance of our approach. Nowadays, Large Language Models (LLMs, for short) represent the most promising and relevant advancement in the field of Artificial Intelligence. These complex deep learning models are trained on massive datasets that cover almost all aspects of people's daily lives, thus granting them the capability of generating, understanding, and processing human language. For this reason, their integration as support tools is becoming pervasive with applications spanning from text editor and proofreading to virtual assistant and personalized text generation. However, the diffusion of this technology, especially in critical domains such as government organizations and medical institutions, imposes the assessment of their security and privacy characteristics.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2504.217

Country: North America > United States (0.28)

Genre: Research Report > New Finding (0.93)

Industry:

Law > Civil Rights & Constitutional Law (1.00)
Information Technology > Security & Privacy (1.00)
Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

GEMMA-SQL: A Novel Text-to-SQL Model Based on Large Language Models

Pandey, Hari Mohan, Gupta, Anshul, Sarkar, Subham, Tomer, Minakshi, Johannes, Schneider, Gong, Yan

arXiv.org Artificial IntelligenceNov-10-2025

Text-to-SQL systems enable users to interact with structured databases using natural language, eliminating the need for specialized programming knowledge. In this work, we introduce GEMMA-SQL, a lightweight and efficient text-to-SQL model built upon the open-source Gemma 2B architecture. Unlike many large language models (LLMs), GEMMA-SQL is fine-tuned in a resource-efficient, iterative manner and can be deployed on low-cost hardware. Leveraging the SPIDER benchmark for training and evaluation, GEMMA-SQL combines multiple prompting strategies, including few-shot learning, to enhance SQL query generation accuracy. The instruction-tuned variant, GEMMA-SQL Instruct, achieves 66.8% Test-Suite accuracy and 63.3% Exact Set Match accuracy, outperforming several state-of-the-art baselines such as IRNet, RYANSQL, and CodeXDavinci. The proposed approach demonstrates that effective prompt design and targeted instruction tuning can significantly boost performance while maintaining high scalability and adaptability. These results position GEMMA-SQL as a practical, open-source alternative for robust and accessible text-to-SQL systems.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2511.0471

Country:

Europe (0.46)
Asia > India (0.28)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)
Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Efficiency vs. Alignment: Investigating Safety and Fairness Risks in Parameter-Efficient Fine-Tuning of LLMs

Taraghi, Mina, Pequignot, Yann, Nikanjam, Amin, Merzouk, Mohamed Amine, Khomh, Foutse

arXiv.org Artificial IntelligenceNov-4-2025

Organizations are increasingly adopting and adapting Large Language Models (LLMs) hosted on public repositories such as HuggingFace. Although these adaptations often improve performance on specialized downstream tasks, recent evidence indicates that they can also degrade a model's safety or fairness. Since different fine-tuning techniques may exert distinct effects on these critical dimensions, this study undertakes a systematic assessment of their trade-offs. Four widely used Parameter-Efficient Fine-Tuning methods, LoRA, IA3, Prompt-Tuning, and P-Tuning, are applied to four instruction-tuned model families (Meta-Llama-3-8B, Qwen2.5-7B, Mistral-7B, and Gemma-7B). In total, 235 fine-tuned variants are evaluated across eleven safety hazard categories and nine demographic fairness dimensions. The results show that adapter-based approaches (LoRA, IA3) tend to improve safety scores and are the least disruptive to fairness, retaining higher accuracy and lower bias scores. In contrast, prompt-based methods (Prompt-Tuning and P-Tuning) generally reduce safety and cause larger fairness regressions, with decreased accuracy and increased bias. Alignment shifts are strongly moderated by base model type: LLaMA remains stable, Qwen records modest gains, Gemma experiences the steepest safety decline, and Mistral, which is released without an internal moderation layer, displays the greatest variance. Improvements in safety do not necessarily translate into improvements in fairness, and no single configuration optimizes all fairness metrics simultaneously, indicating an inherent trade-off between these objectives. These findings suggest a practical guideline for safety-critical deployments: begin with a well-aligned base model, favour adapter-based PEFT, and conduct category-specific audits of both safety and fairness.

category, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2511.00382

Country: North America > Canada > Quebec (0.46)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback