slm
SLM: A Smoothed First-Order Lagrangian Method for Structured Constrained Nonconvex Optimization
Functional constrained optimization (FCO) has emerged as a powerful tool for solving various machine learning problems. However, with the rapid increase in applications of neural networks in recent years, it has become apparent that both the objective and constraints often involve nonconvex functions, which poses significant challenges in obtaining high-quality solutions. In this work, we focus on a class of nonconvex FCO problems with nonconvex constraints, where the two optimization variables are nonlinearly coupled in the inequality constraint. Leveraging the primal-dual optimization framework, we propose a smoothed first-order Lagrangian method (SLM) for solving this class of problems. We establish the theoretical convergence guarantees of SLM to the Karush-Kuhn-Tucker (KKT) solutions through quantifying dual error bounds. By establishing connections between this structured FCO and equilibrium-constrained nonconvex problems (also known as bilevel optimization), we apply the proposed SLM to tackle bilevel optimization oriented problems where the lower-level problem is nonconvex. Numerical results obtained from both toy examples and hyper-data cleaning problems demonstrate the superiority of SLM compared to benchmark methods.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
What are small language models and how do they differ from large ones?
What are small language models and how do they differ from large ones? Microsoft recently released its latest small language model that can operate directly on the user's computer. If you haven't followed the AI industry closely, you might be asking: what exactly a small language model (SLM)? As AI becomes increasingly central to how we work, learn and solve problems, understanding the different types of AI models has never been more important. Large language models (LLMs) such as ChatGPT, Claude, Gemini and others are in widespread use.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
- Asia > India (0.05)
SLM: A Smoothed First-Order Lagrangian Method for Structured Constrained Nonconvex Optimization
Functional constrained optimization (FCO) has emerged as a powerful tool for solving various machine learning problems. However, with the rapid increase in applications of neural networks in recent years, it has become apparent that both the objective and constraints often involve nonconvex functions, which poses significant challenges in obtaining high-quality solutions. In this work, we focus on a class of nonconvex FCO problems with nonconvex constraints, where the two optimization variables are nonlinearly coupled in the inequality constraint. Leveraging the primal-dual optimization framework, we propose a smoothed first-order Lagrangian method (SLM) for solving this class of problems. We establish the theoretical convergence guarantees of SLM to the Karush-Kuhn-Tucker (KKT) solutions through quantifying dual error bounds. By establishing connections between this structured FCO and equilibrium-constrained nonconvex problems (also known as bilevel optimization), we apply the proposed SLM to tackle bilevel optimization oriented problems where the lower-level problem is nonconvex. Numerical results obtained from both toy examples and hyper-data cleaning problems demonstrate the superiority of SLM compared to benchmark methods.
StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models
In this paper, we present StyleTTS 2, a text-to-speech (TTS) model that leverages style diffusion and adversarial training with large speech language models (SLMs) to achieve human-level TTS synthesis. StyleTTS 2 differs from its predecessor by modeling styles as a latent random variable through diffusion models to generate the most suitable style for the text without requiring reference speech, achieving efficient latent diffusion while benefiting from the diverse speech synthesis offered by diffusion models. Furthermore, we employ large pre-trained SLMs, such as WavLM, as discriminators with our novel differentiable duration modeling for end-to-end training, resulting in improved speech naturalness. StyleTTS 2 surpasses human recordings on the single-speaker LJSpeech dataset and matches it on the multispeaker VCTK dataset as judged by native English speakers. Moreover, when trained on the LibriTTS dataset, our model outperforms previous publicly available models for zero-shot speaker adaptation. This work achieves the first human-level TTS on both single and multispeaker datasets, showcasing the potential of style diffusion and adversarial training with large SLMs. The audio demos and source code are available at https://styletts2.github.io/.
RaX-Crash: A Resource Efficient and Explainable Small Model Pipeline with an Application to City Scale Injury Severity Prediction
Zhu, Di, Xie, Chen, Wang, Ziwei, Zhang, Haoyun
New York City reports over one hundred thousand motor vehicle collisions each year, creating substantial injury and public health burden. We present RaX-Crash, a resource efficient and explainable small model pipeline for structured injury severity prediction on the official NYC Motor Vehicle Collisions dataset. RaX-Crash integrates three linked tables with tens of millions of records, builds a unified feature schema in partitioned storage, and trains compact tree based ensembles (Random Forest and XGBoost) on engineered tabular features, which are compared against locally deployed small language models (SLMs) prompted with textual summaries. On a temporally held out test set, XGBoost and Random Forest achieve accuracies of 0.7828 and 0.7794, clearly outperforming SLMs (0.594 and 0.496); class imbalance analysis shows that simple class weighting improves fatal recall with modest accuracy trade offs, and SHAP attribution highlights human vulnerability factors, timing, and location as dominant drivers of predicted severity. Overall, RaX-Crash indicates that interpretable small model ensembles remain strong baselines for city scale injury analytics, while hybrid pipelines that pair tabular predictors with SLM generated narratives improve communication without sacrificing scalability.
- North America > United States > New York (0.24)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- Health & Medicine (1.00)
- Transportation (0.69)
Bayesian Optimization and Convolutional Neural Networks for Zernike-Based Wavefront Correction in High Harmonic Generation
Fernandes, Guilherme Grancho D., Alexandrino, Duarte, Silva, Eduardo, Matias, João, Pereira, Joaquim
High harmonic generation (HHG) is a nonlinear process that enables table-top generation of tunable, high-energy, coherent, ultrashort radiation pulses in the extreme ultraviolet (EUV) to soft X-ray range. These pulses find applications in photoemission spectroscopy in condensed matter physics, pump-probe spectroscopy for high-energy-density plasmas, and attosecond science. However, optical aberrations in the high-power laser systems required for HHG degrade beam quality and reduce efficiency. W e present a machine learning approach to optimize aberration correction using a spatial light modulator . W e implemented and compared Bayesian optimization and convolutional neural network (CNN) methods to predict optimal Zernike polynomial coefficients for wavefront correction. Our CNN achieved promising results with 80.39% accuracy on test data, demonstrating the potential for automated aberration correction in HHG systems.
David vs. Goliath: Can Small Models Win Big with Agentic AI in Hardware Design?
Shankar, Shashwat, Pandey, Subhranshu, Mochahari, Innocent Dengkhw, Mali, Bhabesh, Chowdhury, Animesh Basak, Bhattacharjee, Sukanta, Karfa, Chandan
Large Language Model(LLM) inference demands massive compute and energy, making domain-specific tasks expensive and unsustainable. As foundation models keep scaling, we ask: Is bigger always better for hardware design? Our work tests this by evaluating Small Language Models coupled with a curated agentic AI framework on NVIDIA's Comprehensive Verilog Design Problems(CVDP) benchmark. Results show that agentic workflows: through task decomposition, iterative feedback, and correction - not only unlock near-LLM performance at a fraction of the cost but also create learning opportunities for agents, paving the way for efficient, adaptive solutions in complex design tasks.
- North America > United States (0.04)
- Asia > India > Assam > Guwahati (0.04)
- Africa > Mali (0.04)
EQ-Negotiator: Dynamic Emotional Personas Empower Small Language Models for Edge-Deployable Credit Negotiation
Long, Yunbo, Liu, Yuhan, Brintrup, Alexandra
The deployment of large language models (LLMs) in automated negotiation has set a high performance benchmark, but their computational cost and data privacy requirements render them unsuitable for many privacy-sensitive, on-device applications such as mobile assistants, embodied AI agents or private client interactions. While small language models (SLMs) offer a practical alternative, they suffer from a significant performance gap compared to LLMs in playing emotionally charged complex personas, especially for credit negotiation. This paper introduces EQ-Negotiator, a novel framework that bridges this capability gap using emotional personas. Its core is a reasoning system that integrates game theory with a Hidden Markov Model(HMM) to learn and track debtor emotional states online, without pre-training. This allows EQ-Negotiator to equip SLMs with the strategic intelligence to counter manipulation while de-escalating conflict and upholding ethical standards. Through extensive agent-to-agent simulations across diverse credit negotiation scenarios, including adversarial debtor strategies like cheating, threatening, and playing the victim, we show that a 7B parameter language model with EQ-Negotiator achieves better debt recovery and negotiation efficiency than baseline LLMs more than 10 times its size. This work advances persona modeling from descriptive character profiles to dynamic emotional architectures that operate within privacy constraints. Besides, this paper establishes that strategic emotional intelligence, not raw model scale, is the critical factor for success in automated negotiation, paving the way for effective, ethical, and privacy-preserving AI negotiators that can operate on the edge.
- North America > Canada > Ontario > Toronto (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > United Kingdom > England > Greater London > London (0.04)
Cross-Lingual Interleaving for Speech Language Models
Moumen, Adel, Sun, Guangzhi, Woodland, Philip C.
Spoken Language Models (SLMs) aim to learn linguistic competence directly from speech using discrete units, widening access to Natural Language Processing (NLP) technologies for languages with limited written resources. However, progress has been largely English-centric due to scarce spoken evaluation benchmarks and training data, making cross-lingual learning difficult. We present a cross-lingual interleaving method that mixes speech tokens across languages without textual supervision. We also release an EN-FR training dataset, TinyStories (~42k hours), together with EN-FR spoken StoryCloze and TopicCloze benchmarks for cross-lingual semantic evaluation, both synthetically generated using GPT-4. On 360M and 1B SLMs under matched training-token budgets, interleaving improves monolingual semantic accuracy, enables robust cross-lingual continuation, and strengthens cross-lingual hidden-state alignment. Taken together, these results indicate that cross-lingual interleaving is a simple, scalable route to building multilingual SLMs that understand and converse across languages. All resources will be made open-source to support reproducibility.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.86)
- Europe > Austria > Vienna (0.15)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (6 more...)