South America
Open Deep Search: Democratizing Search with Open-source Reasoning Agents
Alzubi, Salaheddin, Brooks, Creston, Chiniya, Purva, Contente, Edoardo, von Gerlach, Chiara, Irwin, Lucas, Jiang, Yihan, Kaz, Arda, Nguyen, Windsor, Oh, Sewoong, Tyagi, Himanshu, Viswanath, Pramod
We introduce Open Deep Search (ODS) to close the widening gap between proprietary search AI solutions, such as Perplexity's Sonar Reasoning Pro and OpenAI's GPT-4o Search Preview, and their open-source counterparts. The main innovation in ODS is to augment the reasoning capabilities of the latest open-source LLMs with reasoning agents that can judiciously use web search tools to answer queries. Concretely, ODS consists of two components that work with a base LLM chosen by the user: Open Search Tool and Open Reasoning Agent. Open Reasoning Agent interprets the given task and completes it by orchestrating a sequence of actions that includes calling tools, one of which is the Open Search Tool. Open Search Tool is a novel web search tool that outperforms proprietary counterparts. Together with powerful open-source reasoning LLMs, such as DeepSeek-R1, ODS nearly matches and sometimes surpasses the existing state-of-the-art baselines on two benchmarks: SimpleQA and FRAMES. For example, on the FRAMES evaluation benchmark, ODS improves on the best existing baseline, the recently released GPT-4o Search Preview, by 9.7% in accuracy. ODS is a general framework for seamlessly augmenting any LLM -- for example, DeepSeek-R1, which achieves 82.4% on SimpleQA and 30.1% on FRAMES -- with search and reasoning capabilities to achieve state-of-the-art performance: 88.3% on SimpleQA and 75.3% on FRAMES.
AdaptiVocab: Enhancing LLM Efficiency in Focused Domains through Lightweight Vocabulary Adaptation
Nakash, Itay, Calderon, Nitay, David, Eyal Ben, Hoffer, Elad, Reichart, Roi
Large Language Models (LLMs) have shown impressive versatility as general-purpose models. However, their broad applicability comes at a high computational cost, particularly in auto-regressive decoding, where each step requires a forward pass. In domain-specific settings, general-purpose capabilities are unnecessary and can be exchanged for efficiency. In this work, we take a novel perspective on domain adaptation, reducing latency and computational costs by adapting the vocabulary to focused domains of interest. We introduce AdaptiVocab, an end-to-end approach for vocabulary adaptation, designed to enhance LLM efficiency in low-resource domains. AdaptiVocab can be applied to any tokenizer and architecture, modifying the vocabulary by replacing tokens with domain-specific n-gram-based tokens, thereby reducing the number of tokens required for both input processing and output generation. AdaptiVocab initializes new n-token embeddings using an exponentially weighted combination of existing embeddings and employs a lightweight fine-tuning phase that can be efficiently performed on a single GPU. We evaluate two 7B LLMs across three niche domains, assessing efficiency, generation quality, and end-task performance. Our results show that AdaptiVocab reduces token usage by over 25% without compromising performance.
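The embedding initialization described above can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so everything here is an assumption made for illustration: the function name `exp_weighted_embedding`, the `rate` parameter, and the choice of weights proportional to exp(rate * position), normalized to sum to one, are all hypothetical.

```python
import math


def exp_weighted_embedding(constituents, rate=1.0):
    """Combine the embeddings of an n-gram's constituent tokens into one vector.

    Assumption (not from the abstract): the token at position i receives a
    weight proportional to exp(rate * i), so later tokens count more when
    rate > 0, and rate = 0 reduces to a plain average.
    """
    raw = [math.exp(rate * i) for i in range(len(constituents))]
    total = sum(raw)
    weights = [w / total for w in raw]  # normalized, sums to 1
    dim = len(constituents[0])
    return [
        sum(w * vec[d] for w, vec in zip(weights, constituents))
        for d in range(dim)
    ]


# Two toy 2-dimensional embeddings standing in for existing token vectors.
first_token = [1.0, 0.0]
second_token = [0.0, 1.0]
new_embedding = exp_weighted_embedding([first_token, second_token], rate=1.0)
```

With `rate=0.0` the same call returns the unweighted mean of the constituent vectors, which makes the effect of the exponential weighting easy to compare against a simple baseline.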
LENVIZ: A High-Resolution Low-Exposure Night Vision Benchmark Dataset
Aithal, Manjushree, VidalMata, Rosaura G., Kartha, Manikandtan, Chen, Gong, Adhikarla, Eashan, Kirsten, Lucas N., Fu, Zhicheng, Madhusudhana, Nikhil A., Nasti, Joe
Low-light image enhancement is crucial for a myriad of applications, from night vision and surveillance to autonomous driving. However, due to the inherent limitations that go hand in hand with capturing images in low-illumination environments, the task of enhancing such scenes still presents a formidable challenge. To advance research in this field, we introduce our Low Exposure Night Vision (LENVIZ) Dataset, a comprehensive multi-exposure benchmark dataset for low-light image enhancement comprising over 230K frames showcasing 24K real-world indoor and outdoor scenes, with and without humans. Captured using 3 different camera sensors, LENVIZ offers a wide range of lighting conditions, noise levels, and scene complexities, making it the largest publicly available benchmark in the field at up to 4K resolution. LENVIZ includes high-quality human-generated ground truth, for which each multi-exposure low-light scene has been meticulously curated and edited by expert photographers to ensure optimal image quality. Furthermore, we conduct a comprehensive analysis of current state-of-the-art low-light image enhancement techniques on our dataset and highlight potential areas of improvement.
Analyzable Chain-of-Musical-Thought Prompting for High-Fidelity Music Generation
Lam, Max W. Y., Xing, Yijin, You, Weiya, Wu, Jingcheng, Yin, Zongyu, Jiang, Fuqiang, Liu, Hangyu, Liu, Feng, Li, Xingda, Lu, Wei-Tsung, Chen, Hanyu, Feng, Tong, Zhao, Tianwei, Liu, Chien-Hung, Song, Xuchen, Li, Yang, Zhou, Yahui
Autoregressive (AR) models have demonstrated impressive capabilities in generating high-fidelity music. However, the conventional next-token prediction paradigm in AR models does not align with the human creative process in music composition, potentially compromising the musicality of generated samples. To overcome this limitation, we introduce MusiCoT, a novel chain-of-thought (CoT) prompting technique tailored for music generation. MusiCoT empowers the AR model to first outline an overall music structure before generating audio tokens, thereby enhancing the coherence and creativity of the resulting compositions. By leveraging the contrastive language-audio pretraining (CLAP) model, we establish a chain of "musical thoughts", making MusiCoT scalable and independent of human-labeled data, in contrast to conventional CoT methods. Moreover, MusiCoT allows for in-depth analysis of music structure, such as instrumental arrangements, and supports music referencing -- accepting variable-length audio inputs as optional style references. This innovative approach effectively addresses copying issues, positioning MusiCoT as a vital practical method for music prompting. Our experimental results indicate that MusiCoT consistently achieves superior performance across both objective and subjective metrics, producing music quality that rivals state-of-the-art generation models. Our samples are available at https://MusiCoT.github.io/.
A Probabilistic Neuro-symbolic Layer for Algebraic Constraint Satisfaction
Kurscheidt, Leander, Morettin, Paolo, Sebastiani, Roberto, Passerini, Andrea, Vergari, Antonio
In safety-critical applications, guaranteeing the satisfaction of constraints over continuous environments is crucial, e.g., an autonomous agent should never crash into obstacles or go off-road. Neural models struggle in the presence of these constraints, especially when they involve intricate algebraic relationships. To address this, we introduce a differentiable probabilistic layer that guarantees the satisfaction of non-convex algebraic constraints over continuous variables. This probabilistic algebraic layer (PAL) can be seamlessly plugged into any neural architecture and trained via maximum likelihood without requiring approximations. PAL defines a distribution over conjunctions and disjunctions of linear inequalities, parameterized by polynomials. This formulation enables efficient and exact renormalization via symbolic integration, which can be amortized across different data points and easily parallelized on a GPU. We showcase PAL and our integration scheme on a number of benchmarks for algebraic constraint integration and on real-world trajectory data.
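To make the idea of exact renormalization concrete, here is a one-dimensional sketch. It is not PAL itself: the feasible set is a hand-written union of disjoint intervals (a disjunction of conjunctions of linear inequalities on a single variable), the unnormalized density is a fixed polynomial assumed non-negative on that set, and the helper names `integrate_poly`, `normalizer`, and `density` are hypothetical.

```python
from fractions import Fraction


def integrate_poly(coeffs, lo, hi):
    """Exact integral of sum(c_k * y**k) over [lo, hi] via the antiderivative."""
    lo, hi = Fraction(lo), Fraction(hi)
    return sum(
        Fraction(c) * (hi ** (k + 1) - lo ** (k + 1)) / (k + 1)
        for k, c in enumerate(coeffs)
    )


def normalizer(coeffs, intervals):
    """Total mass of the polynomial over a union of disjoint intervals."""
    return sum(integrate_poly(coeffs, lo, hi) for lo, hi in intervals)


def density(y, coeffs, intervals):
    """Normalized density: zero outside the feasible set, polynomial / Z inside."""
    if not any(lo <= y <= hi for lo, hi in intervals):
        return Fraction(0)
    value = sum(Fraction(c) * Fraction(y) ** k for k, c in enumerate(coeffs))
    return value / normalizer(coeffs, intervals)


# Unnormalized density y**2 + 1, coefficients listed from lowest degree.
poly = [1, 0, 1]
# Feasible set: (0 <= y <= 1) OR (2 <= y <= 3).
feasible = [(0, 1), (2, 3)]

z = normalizer(poly, feasible)          # Fraction(26, 3)
inside = density(1, poly, feasible)     # Fraction(3, 13)
outside = density(Fraction(3, 2), poly, feasible)  # Fraction(0)
```

Because the normalizer is computed in closed form, points outside the feasible set receive exactly zero probability and no sampling-based approximation is involved, which is the property the abstract emphasizes.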
Generative Linguistics, Large Language Models, and the Social Nature of Scientific Success
Chomsky (1968: 3) greeted the rise of computing technology with skepticism, arguing that "the kinds of structures that are realizable in terms of [computational methods] are simply not those that must be postulated to underlie the use of language." Fifty-five years later, Piantadosi (2023: 15) celebrated the release of ChatGPT by directing that same criticism toward generative linguistics: "the success of large language models is a failure for generative theories because it goes against virtually all of the principles these theories have espoused." Chesi (forthcoming) may not agree with Piantadosi's criticisms, but he does take them as a harbinger of scientific crisis. The minimalist program, hampered by a lack of formal and empirical rigor, has failed to produce a comprehensive, self-consistent theory of syntax. ChatGPT's apparent linguistic competence, in tandem with the success of computational accounts of gradient acceptability and online phenomena, seems to suggest that "generative linguistics no longer dictates the agenda for future linguistic challenges" (Chesi forthcoming: 2). In order to survive, Chesi warns, generativists need to make progress towards a theory that is based on precisely stated principles and evaluated on a common set of explananda. Chesi's target paper presents the current collision of the worlds as a debate about the intellectual merits of generativist theories. According to Chesi, the success of generativism depends on generativists' ability to resolve their deficits of rigor, so that they can parry the theoretical attacks that language models have levied against core principles of minimalism. This response argues, contrary to Chesi's framing but consistent with current consensus in the history and sociology of science (Fleck 1935; Kuhn 1962; Mullins 1975; Latour 1984; Law & Lodge 1984), that the generativist crisis described by Piantadosi and Chesi is social in nature, and cannot be averted by intellectual means.
Towards Long-Range ENSO Prediction with an Explainable Deep Learning Model
Chen, Qi, Cui, Yinghao, Hong, Guobin, Ashok, Karumuri, Pu, Yuchun, Zheng, Xiaogu, Zhang, Xuanze, Zhong, Wei, Zhan, Peng, Wang, Zhonglei
The evolution of the El Niño-Southern Oscillation (ENSO) is governed by intricate air-sea interactions, posing significant challenges for long-term prediction. In this study, we introduce CTEFNet, a multivariate deep learning model that combines convolutional neural networks and transformers to enhance ENSO forecasting. By integrating multiple oceanic and atmospheric predictors, CTEFNet extends the effective forecast lead time to 20 months while mitigating the impact of the spring predictability barrier, outperforming both dynamical models and state-of-the-art deep learning approaches. Furthermore, CTEFNet offers physically meaningful and statistically significant insights through gradient-based sensitivity analysis, revealing the key precursor signals that govern ENSO dynamics, which align with well-established theories and shed new light on inter-basin interactions among the Pacific, Atlantic, and Indian Oceans. CTEFNet's superior predictive skill and interpretable sensitivity assessments underscore its potential for advancing climate prediction. Our findings highlight the importance of multivariate coupling in ENSO evolution and demonstrate the promise of deep learning in capturing complex climate dynamics with enhanced interpretability. 1 Introduction El Niño-Southern Oscillation (ENSO) is one of the most prominent modes of inter-annual climate variability, characterized by shifts in sea surface temperatures (SST) across the tropical Pacific Ocean and the weakening of equatorial trade winds.
Dynamics of Structured Complex-Valued Hopfield Neural Networks
Garimella, Rama Murthy, Valle, Marcos Eduardo, Vieira, Guilherme, Rayala, Anil, Munugoti, Dileep
In this paper, we explore the dynamics of structured complex-valued Hopfield neural networks (CvHNNs), which arise when the synaptic weight matrix possesses specific structural properties. We begin by analyzing CvHNNs with a Hermitian synaptic weight matrix and establish the existence of four-cycle dynamics in CvHNNs with skew-Hermitian weight matrices operating synchronously. Furthermore, we introduce two new classes of complex-valued matrices: braided Hermitian and braided skew-Hermitian matrices. We demonstrate that CvHNNs utilizing these matrix types exhibit cycles of length eight when operating in full parallel update mode. Finally, we conduct extensive computational experiments on synchronous CvHNNs, exploring other synaptic weight matrix structures. This work was supported in part by the National Council for Scientific and Technological Development (CNPq) under grant no. 315820/2021-7, the São Paulo Research Foundation (FAPESP) under grant no. 2023/03368-0, and the Postdoctoral Researcher Program (PPPD) at the Universidade Estadual de Campinas (UNICAMP). Keywords-- Hopfield neural network, complex-valued neural network, associative memory, braided Hermitian matrix. 1 Introduction Artificial neural networks have been conceived as emulators of the biological neural network synapse process. Their processing units, the artificial neurons, usually act based on input signals received from other neurons or cells. Like a biological neuron firing an electric impulse in the presence of specific chemical components in appropriate concentrations, an artificial neuron fires when certain mathematical conditions are satisfied.
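The four-cycle behaviour under synchronous updates can be observed in a toy simulation. The abstract does not specify the activation function, so this sketch assumes a four-state phase quantizer that maps each neuron's input to the nearest of 1, i, -1, -i; the helper names and the 2x2 weight matrix (real skew-symmetric, hence skew-Hermitian) are illustrative choices, not taken from the paper.

```python
# Allowed neuron states: the four fourth roots of unity (an assumption).
STATES = (1 + 0j, 1j, -1 + 0j, -1j)


def quantize(z):
    """Map a complex input to the allowed state with the closest phase."""
    return max(STATES, key=lambda s: (z * s.conjugate()).real)


def synchronous_step(weights, state):
    """Update every neuron at once from the previous state (parallel mode)."""
    return tuple(
        quantize(sum(w * x for w, x in zip(row, state))) for row in weights
    )


def cycle_length(weights, state, max_steps=100):
    """Iterate until a state repeats and return the length of that cycle."""
    seen = {}
    state = tuple(state)
    for step in range(max_steps):
        if state in seen:
            return step - seen[state]
        seen[state] = step
        state = synchronous_step(weights, state)
    return None  # no repeat found within max_steps


# Skew-Hermitian weights: the conjugate transpose equals the negated matrix.
W = [[0, 1],
     [-1, 0]]

length = cycle_length(W, (1 + 0j, 1 + 0j))  # 4
```

Starting from (1, 1), the trajectory visits (1, -1), (-1, -1), and (-1, 1) before returning to (1, 1), so the detected cycle has length four for this particular matrix and initial state.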
A Systematic Review of EEG-based Machine Intelligence Algorithms for Depression Diagnosis, and Monitoring
Nassibi, Amir, Papavassiliou, Christos, Rakhmatulin, Ildar, Mandic, Danilo, Atashzar, S. Farokh
Depression disorder is a serious health condition that has affected the lives of millions of people around the world. Diagnosis of depression is a challenging practice that relies heavily on subjective studies and, in most cases, suffers from late findings. Electroencephalography (EEG) biomarkers have been suggested and investigated in recent years as a potentially transformative objective practice. In this article, for the first time, a detailed systematic review of EEG-based depression diagnosis approaches that use advanced machine learning techniques and statistical analyses is conducted. For this, 938 potentially relevant articles (since 1985) were initially identified and filtered down to 139 relevant articles based on the review scheme 'preferred reporting items for systematic reviews and meta-analyses (PRISMA).' This article compares and discusses the selected articles and categorizes them according to the type of machine learning techniques and statistical analyses. Algorithms, preprocessing techniques, extracted features, and data acquisition systems are discussed and summarized. This review paper explains the existing challenges of the current algorithms and sheds light on the future direction of the field. This systematic review outlines the issues and challenges in machine intelligence for EEG-based depression diagnosis that can be addressed in future studies and possibly in future wearable technologies.
The Greatest Good Benchmark: Measuring LLMs' Alignment with Utilitarian Moral Dilemmas
Marraffini, Giovanni Franco Gabriel, Cotton, Andrés, Hsueh, Noe Fabian, Fridman, Axel, Wisznia, Juan, Del Corro, Luciano
The question of how to make decisions that maximise the well-being of all persons is highly relevant to designing language models that are beneficial to humanity and free from harm. We introduce the Greatest Good Benchmark to evaluate the moral judgments of LLMs using utilitarian dilemmas. Our analysis across 15 diverse LLMs reveals consistently encoded moral preferences that diverge from established moral theories and lay population moral standards. Most LLMs have a marked preference for impartial beneficence and a rejection of instrumental harm. These findings showcase the 'artificial moral compass' of LLMs, offering insights into their moral alignment.