AITopics | sllm

Collaborating Authors

sllm

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

ixi-GEN: Efficient Industrial sLLMs through Domain Adaptive Continual Pretraining

Kim, Seonwu, Na, Yohan, Kim, Kihun, Cho, Hanhee, Lim, Geun, Kim, Mintae, Park, Seongik, Kim, Ki Hyun, Han, Youngsub, Jeon, Byoung-Ki

arXiv.org Artificial IntelligenceOct-24-2025

The emergence of open-source large language models (LLMs) has expanded opportunities for enterprise applications; however, many organizations still lack the infrastructure to deploy and maintain large-scale models. As a result, small LLMs (sLLMs) have become a practical alternative despite inherent performance limitations. While Domain Adaptive Continual Pretraining (DACP) has been explored for domain adaptation, its utility in commercial settings remains under-examined. In this study, we validate the effectiveness of a DACP-based recipe across diverse foundation models and service domains, producing DACP-applied sLLMs (ixi-GEN). Through extensive experiments and real-world evaluations, we demonstrate that ixi-GEN models achieve substantial gains in target-domain performance while preserving general capabilities, offering a cost-efficient and scalable solution for enterprise-level deployment.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2507.06795

Country: Asia (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Banking & Finance (1.00)
Information Technology (0.93)
Telecommunications (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Dynamic Parameter Memory: Temporary LoRA-Enhanced LLM for Long-Sequence Emotion Recognition in Conversation

Mai, Jialong, Xing, Xiaofen, Li, Yawei, Chen, Weidong, Li, Zhipeng, Xing, Jingyuan, Xu, Xiangmin

arXiv.org Artificial IntelligenceSep-25-2025

Recent research has focused on applying speech large language model (SLLM) to improve speech emotion recognition (SER). However, the inherently high frame rate in speech modality severely limits the signal processing and understanding capabilities of SLLM. For example, a SLLM with a 4K context window can only process 80 seconds of audio at 50Hz feature sampling rate before reaching its capacity limit. Input token compression methods used in SLLM overlook the continuity and inertia of emotions across multiple conversation turns. This paper proposes a Dynamic Parameter Memory (DPM) mechanism with contextual semantics and sentence-level emotion encoding, enabling processing of unlimited-length audio with limited context windows in SLLM. Specifically, DPM progressively encodes sentence-level information and emotions into a temporary LoRA module during inference to effectively "memorize" the contextual information. We trained an emotion SLLM as a backbone and incorporated our DPM into inference for emotion recognition in conversation (ERC). Experimental results on the IEMOCAP and MELD datasets show that DPM significantly improves the emotion recognition capabilities of SLLM when processing long audio sequences, achieving state-of-the-art performance.

artificial intelligence, large language model, natural language, (16 more...)

arXiv.org Artificial Intelligence

2507.09076

Country: Asia > China (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (1.00)

Add feedback

EmoSLLM: Parameter-Efficient Adaptation of LLMs for Speech Emotion Recognition

Thimonier, Hugo, Perzo, Antony, Seguier, Renaud

arXiv.org Artificial IntelligenceAug-21-2025

Emotion recognition from speech is a challenging task that requires capturing both linguistic and paralinguistic cues, with critical applications in human-computer interaction and mental health monitoring. Recent works have highlighted the ability of Large Language Models (LLMs) to perform tasks outside of the sole natural language area. In particular, recent approaches have investigated coupling LLMs with other data modalities by using pre-trained backbones and different fusion mechanisms. This work proposes a novel approach that fine-tunes an LLM with audio and text representations for emotion prediction. Our method first extracts audio features using an audio feature extractor, which are then mapped into the LLM's representation space via a learnable interfacing module. The LLM takes as input (1) the transformed audio features, (2) additional features in the form of natural language (e.g., the transcript), and (3) a textual prompt describing the emotion prediction task. To efficiently adapt the LLM to this multimodal task, we employ Low-Rank Adaptation (LoRA), enabling parameter-efficient fine-tuning. Experimental results on standard emotion recognition benchmarks demonstrate that our model outperforms all but one existing Speech-Text LLMs in the literature, while requiring less than half the parameters of competing approaches. This highlights our approach's effectiveness in integrating multi-modal inputs for speech-based emotion understanding while maintaining significant computational efficiency.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2508.1413

Country:

North America > United States (0.28)
Europe > France (0.28)

Genre: Research Report > Promising Solution (0.34)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Enhancing Generalization of Speech Large Language Models with Multi-Task Behavior Imitation and Speech-Text Interleaving

Xie, Jingran, Li, Xiang, Wang, Hui, Yu, Yue, Xiang, Yang, Wu, Xixin, Wu, Zhiyong

arXiv.org Artificial IntelligenceMay-27-2025

Large language models (LLMs) have shown remarkable generalization across tasks, leading to increased interest in integrating speech with LLMs. These speech LLMs (SLLMs) typically use supervised fine-tuning to align speech with text-based LLMs. However, the lack of annotated speech data across a wide range of tasks hinders alignment efficiency, resulting in poor generalization. To address these issues, we propose a novel multi-task 'behavior imitation' method with speech-text interleaving, called MTBI, which relies solely on paired speech and transcripts. By ensuring the LLM decoder generates equivalent responses to paired speech and text, we achieve a more generalized SLLM. Interleaving is used to further enhance alignment efficiency. We introduce a simple benchmark to evaluate prompt and task generalization across different models. Experimental results demonstrate that our MTBI outperforms SOTA SLLMs on both prompt and task generalization, while requiring less supervised speech data.

arxiv preprint arxiv, large language model, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2505.18644

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Add feedback

ABHINAYA -- A System for Speech Emotion Recognition In Naturalistic Conditions Challenge

Dutta, Soumya, Balaji, Smruthi, R, Varada, Salinamakki, Viveka, Ganapathy, Sriram

arXiv.org Artificial IntelligenceMay-27-2025

Speech emotion recognition (SER) in naturalistic settings remains a challenge due to the intrinsic variability, diverse recording conditions, and class imbalance. As participants in the In-terspeech Naturalistic SER Challenge which focused on these complexities, we present "Abhinaya", a system integrating speech-based, text-based, and speech-text models. Our approach fine-tunes self-supervised and speech large language models (SLLM) for speech representations, leverages large language models (LLM) for textual context, and employs speech-text modeling with an SLLM to capture nuanced emotional cues. To combat class imbalance, we apply tailored loss functions and generate categorical decisions through majority voting. Despite one model not being fully trained, the Abhinaya system ranked 4 th among 166 submissions. Upon completion of training, it achieved state-of-the-art performance among published results, demonstrating the effectiveness of our approach for SER in real-world conditions.

large language model, machine learning, recognition, (14 more...)

arXiv.org Artificial Intelligence

2505.18217

Country:

Asia > India (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.34)

Industry: Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Goal2Story: A Multi-Agent Fleet based on Privately Enabled sLLMs for Impacting Mapping on Requirements Elicitation

Zou, Xinkai, Liu, Yan, Shi, Xiongbo, Yang, Chen

arXiv.org Artificial IntelligenceMar-17-2025

As requirements drift with rapid iterations, agile development becomes the dominant paradigm. Goal-driven Requirements Elicitation (RE) is a pivotal yet challenging task in agile project development due to its heavy tangling with adaptive planning and efficient collaboration. Recently, AI agents have shown promising ability in supporting requirements analysis by saving significant time and effort for stakeholders. However, current research mainly focuses on functional RE, and research works have not been reported bridging the long journey from goal to user stories. Moreover, considering the cost of LLM facilities and the need for data and idea protection, privately hosted small-sized LLM should be further utilized in RE. To address these challenges, we propose Goal2Story, a multi-agent fleet that adopts the Impact Mapping (IM) framework while merely using cost-effective sLLMs for goal-driven RE. Moreover, we introduce a StorySeek dataset that contains over 1,000 user stories (USs) with corresponding goals and project context information, as well as the semi-automatic dataset construction method. For evaluation, we proposed two metrics: Factuality Hit Rate (FHR) to measure consistency between the generated USs with the dataset and Quality And Consistency Evaluation (QuACE) to evaluate the quality of the generated USs. Experimental results demonstrate that Goal2Story outperforms the baseline performance of the Super-Agent adopting powerful LLMs, while also showcasing the performance improvements in key metrics brought by CoT and Agent Profile to Goal2Story, as well as its exploration in identifying latent needs.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2503.13279

Country:

North America > United States (0.14)
Asia > China > Shanghai > Shanghai (0.05)
North America > Canada > Ontario > Toronto (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

OnionEval: An Unified Evaluation of Fact-conflicting Hallucination for Small-Large Language Models

Sun, Chongren, Li, Yuran, Wu, Di, Boulet, Benoit

arXiv.org Artificial IntelligenceJan-22-2025

Large Language Models (LLMs) are highly capable but require significant computational resources for both training and inference. Within the LLM family, smaller models (those with fewer than 10 billion parameters) also perform well across various tasks. However, these smaller models share similar limitations to their larger counterparts, including the tendency to hallucinate. Despite the existence of many benchmarks to evaluate hallucination in LLMs, few have specifically focused on small LLMs (SLLMs). Additionally, SLLMs show widely varying performance across different benchmarks. In this paper, we introduce OnionEval, a multi-layer structured framework with a specific metric called the context-influence score (CI), designed to effectively assess the fact-conflicting hallucination tendencies of small LLMs across different contextual levels. Our experimental results reveal a key feature of SLLMs: they excel in factual analysis but face challenges with context reasoning. Further investigation shows that a simple Chain-of-Thought strategy can significantly reduce these limitations, improving the practical usefulness of SLLMs in real-world applications.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2501.12975

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > Italy > Veneto (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.97)

Add feedback

Multilingual Mathematical Reasoning: Advancing Open-Source LLMs in Hindi and English

Anand, Avinash, Prasad, Kritarth, Kirtani, Chhavi, Nair, Ashwin R, Nema, Manvendra Kumar, Jaiswal, Raj, Shah, Rajiv Ratn

arXiv.org Artificial IntelligenceDec-24-2024

Large Language Models (LLMs) excel in linguistic tasks but struggle with mathematical reasoning, particularly in non English languages like Hindi. This research aims to enhance the mathematical reasoning skills of smaller, resource efficient open-source LLMs in both Hindi and English. We evaluate models like OpenHathi 7B, LLaMA-2 7B, WizardMath 7B, Mistral 7B, LLeMMa 7B, MAmmoTH 7B, Gemini Pro, and GPT-4 using zero-shot, few-shot chain-of-thought (CoT) methods, and supervised fine-tuning. Our approach incorporates curriculum learning, progressively training models on increasingly difficult problems, a novel Decomposition Strategy to simplify complex arithmetic operations, and a Structured Solution Design that divides solutions into phases. Our experiments result in notable performance enhancements. WizardMath 7B exceeds Gemini's accuracy on English datasets by +6% and matches Gemini's performance on Hindi datasets. Adopting a bilingual approach that combines English and Hindi samples achieves results comparable to individual language models, demonstrating the capability to learn mathematical reasoning in both languages. This research highlights the potential for improving mathematical reasoning in open-source LLMs.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2412.18415

Country:

Asia > Singapore (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Switzerland (0.04)
(3 more...)

Genre: Research Report > New Finding (0.93)

Industry: Education > Educational Setting > K-12 Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

SentenceVAE: Enable Next-sentence Prediction for Large Language Models with Faster Speed, Higher Accuracy and Longer Context

An, Hongjun, Chen, Yifan, Sun, Zhe, Li, Xuelong

arXiv.org Artificial IntelligenceAug-14-2024

Current large language models (LLMs) primarily utilize next-token prediction method for inference, which significantly impedes their processing speed. In this paper, we introduce a novel inference methodology termed next-sentence prediction, aiming at enhancing the inference efficiency of LLMs. We present Sentence Variational Autoencoder (SentenceVAE), which includes a Sentence Encoder to compress multiple tokens in a sentence into a single token, and a Sentence Decoder to reconstruct it. By integrating SentenceVAE into the input and output layers of LLMs, we develop Sentence-level LLMs (SLLMs) that employ a sentence-by-sentence inference method. In addition, the SentenceVAE module of SLLMs can maintain the integrity of the original semantic content by segmenting the context into sentences, thereby improving accuracy while boosting inference speed. Moreover, compared to previous LLMs, SLLMs process fewer tokens over equivalent context length, significantly reducing memory demands for self-attention computation and facilitating the handling of longer context. Extensive experiments on Wanjuan dataset have revealed that the proposed method can accelerate inference speed by 204~365%, reduce perplexity (PPL) to 46~75% of its original metric, and decrease memory overhead by 86~91% for the equivalent context length, compared to previous token-by-token methods.

arxiv preprint arxiv, llm, sentencevae, (15 more...)

arXiv.org Artificial Intelligence

2408.00655

Country:

Asia > Singapore (0.04)
Asia > Indonesia > Bali (0.04)
Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models

Lin, Yi-Cheng, Chen, Wei-Chih, Lee, Hung-yi

arXiv.org Artificial IntelligenceAug-14-2024

Warning: This paper may contain texts with uncomfortable content. Large Language Models (LLMs) have achieved remarkable performance in various tasks, including those involving multimodal data like speech. However, these models often exhibit biases due to the nature of their training data. Recently, more Speech Large Language Models (SLLMs) have emerged, underscoring the urgent need to address these biases. This study introduces Spoken Stereoset, a dataset specifically designed to evaluate social biases in SLLMs. By examining how different models respond to speech from diverse demographic groups, we aim to identify these biases. Our experiments reveal significant insights into their performance and bias levels. The findings indicate that while most models show minimal bias, some still exhibit slightly stereotypical or anti-stereotypical tendencies.

continuation, dataset, language model, (10 more...)

arXiv.org Artificial Intelligence

2408.07665

Country:

Asia > Taiwan (0.04)
Africa > Eswatini > Manzini > Manzini (0.04)
North America > United States > Texas (0.04)
Africa > Guinea > Kankan Region > Kankan Prefecture > Kankan (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.46)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.96)

Add feedback