AITopics

Reranking plays a crucial role in modern multi-stage recommender systems by rearranging the initial ranking list. Because the combinatorial search space is inherently hard to explore, some current research adopts an evaluator-generator paradigm, in which a generator produces feasible sequences and an evaluator selects the best sequence based on the estimated list utility. However, these methods still face two issues. First, because the goals of the evaluator and the generator are inconsistent, the generator tends to fit a locally optimal solution of the exposure distribution rather than optimize over the combinatorial space. Second, generating target items one by one struggles to reach optimality because it ignores the information of subsequent items. To address these issues, we propose NLGR, a model that utilizes Neighbor Lists for Generative Reranking and aims to improve the performance of the generator in the combinatorial space. NLGR follows the evaluator-generator paradigm and improves the generator's training and generation methods. Specifically, we use neighbor lists in the combinatorial space to enhance the training process, so that the generator perceives relative scores and finds the optimization direction. Furthermore, we propose a novel sampling-based non-autoregressive generation method, which allows the generator to jump flexibly from the current list to any neighbor list. Extensive experiments on public and industrial datasets validate NLGR's effectiveness, and we have successfully deployed NLGR on the Meituan food delivery platform.
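The idea of moving from the current list to a better-scoring neighbor list can be illustrated with a minimal sketch. Everything here is an assumption for illustration: a "neighbor" is taken to be a list that differs in exactly one position, the evaluator is a hypothetical hand-written utility (position-discounted relevance plus a diversity bonus), and the search is plain greedy hill climbing rather than the trained generator described in the abstract.

```python
def neighbor_lists(current, candidates):
    """Yield lists that differ from `current` in exactly one position (assumed definition)."""
    for pos in range(len(current)):
        for item in candidates:
            if item not in current:
                yield current[:pos] + [item] + current[pos + 1:]


def make_evaluator(relevance, category):
    """Build a hypothetical list-utility function; a real evaluator would be a learned model."""
    def score(lst):
        gain = sum(relevance[item] / (rank + 1) for rank, item in enumerate(lst))
        return gain + 0.5 * len({category[item] for item in lst})
    return score


def rerank(initial, candidates, score):
    """Greedy hill climbing: jump to the best neighbor list until no neighbor improves."""
    current, best = list(initial), score(initial)
    while True:
        neighbors = list(neighbor_lists(current, candidates))
        if not neighbors:
            return current, best
        top = max(neighbors, key=score)
        if score(top) <= best:
            return current, best
        current, best = top, score(top)


# Toy data (hypothetical): five candidate items, a list of length three.
relevance = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6, "e": 0.2}
category = {"a": "pizza", "b": "pizza", "c": "pizza", "d": "sushi", "e": "salad"}
score = make_evaluator(relevance, category)
final_list, final_score = rerank(["a", "b", "c"], list(relevance), score)
```

The result is only a local optimum with respect to the chosen neighborhood, which is the kind of limitation a learned generator guided by relative scores is meant to reduce.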

artificial intelligence, generator, machine learning, (17 more...)

doi: 10.1145/3701716.3715251

2502.06097

Country:

Oceania > Australia > New South Wales > Sydney (0.05)
Asia > China > Sichuan Province > Chengdu (0.05)
North America > United States > New York > New York County > New York City (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.64)

Industry: Information Technology > Services (0.49)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Tran, Hoang-Son, Petrovic, Vladimir, Bardenet, Remi, Ghosh, Subhroshekhar

Negative Dependence as a toolbox for machine learning : review and new developments

arXiv.org Machine Learning, Feb-11-2025

Negative dependence is becoming a key driver in advancing learning capabilities beyond the limits of traditional independence. Recent developments have evidenced support towards negatively dependent systems as a learning paradigm in a broad range of fundamental machine learning challenges including optimization, sampling, dimensionality reduction and sparse signal recovery, often surpassing the performance of current methods based on statistical independence. The most popular negatively dependent model has been that of determinantal point processes (DPPs), which have their origins in quantum theory. However, other models, such as perturbed lattice models, strongly Rayleigh measures, and zeros of random functions, have gained salience in various learning applications. In this article, we review this burgeoning field of research, as it has developed over the past two decades or so. We also present new results on applications of DPPs to the parsimonious representation of neural networks. In the limited scope of the article, we mostly focus on aspects of this area to which the authors contributed over the recent years, including applications to Monte Carlo methods, coresets and stochastic gradient descent, stochastic networks, signal processing and connections to quantum computation. However, starting from basics of negative dependence for the uninitiated reader, extensive references are provided to a broad swath of related developments which could not be covered within our limited scope. While existing works and reviews generally focus on specific negatively dependent models (e.g. DPPs), a notable feature of this article is that it addresses negative dependence as a machine learning methodology as a whole. In this vein, it covers within its span an array of negatively dependent models and their applications well beyond DPPs, thereby putting forward a very general and rather unique perspective.
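As a small illustration of the negative dependence that DPPs encode, the sketch below evaluates the standard L-ensemble formula P(Y = S) = det(L_S) / det(L + I) on a toy 3-item kernel. The kernel values are made up for illustration, and the determinant routine is a naive expansion that is only suitable for tiny matrices.

```python
from itertools import combinations


def det(m):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if not m:
        return 1.0  # determinant of the empty matrix
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def dpp_prob(L, S):
    """P(Y = S) = det(L_S) / det(L + I) for an L-ensemble DPP."""
    n = len(L)
    L_S = [[L[i][j] for j in S] for i in S]
    L_plus_I = [[L[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return det(L_S) / det(L_plus_I)


# Hypothetical similarity kernel: items 0 and 1 are near-duplicates, item 2 is unrelated.
L = [
    [1.0, 0.9, 0.0],
    [0.9, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

p_similar = dpp_prob(L, [0, 1])   # two similar items together
p_diverse = dpp_prob(L, [0, 2])   # two dissimilar items together
total = sum(dpp_prob(L, list(S)) for k in range(4) for S in combinations(range(3), k))
```

Here `p_similar` comes out smaller than `p_diverse`: the determinant shrinks as the selected items become more similar, so the process favors diverse subsets.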

artificial intelligence, dpp, machine learning, (18 more...)

2502.07285

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
(13 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

On Memory Construction and Retrieval for Personalized Conversational Agents

Pan, Zhuoshi, Wu, Qianhui, Jiang, Huiqiang, Luo, Xufang, Cheng, Hao, Li, Dongsheng, Yang, Yuqing, Lin, Chin-Yew, Zhao, H. Vicky, Qiu, Lili, Gao, Jianfeng

To deliver coherent and personalized experiences in long-term conversations, existing approaches typically perform retrieval-augmented response generation by constructing memory banks from conversation history at the turn level, at the session level, or through summarization techniques. In this paper, we present two key findings: (1) The granularity of the memory unit matters: turn-level, session-level, and summarization-based methods each exhibit limitations in both memory retrieval accuracy and the semantic quality of the retrieved content. (2) Prompt compression methods, such as LLMLingua-2, can effectively serve as a denoising mechanism, enhancing memory retrieval accuracy across different granularities. Building on these insights, we propose SeCom, a method that constructs a memory bank of topical segments by introducing a conversation Segmentation model, while performing memory retrieval based on Compressed memory units. Experimental results show that SeCom outperforms turn-level, session-level, and several summarization-based methods on long-term conversation benchmarks such as LOCOMO and Long-MT-Bench+. Additionally, the proposed conversation segmentation method demonstrates superior performance on dialogue segmentation datasets such as DialSeg711, TIAGE, and SuperDialSeg.

large language model, machine learning, natural language, (20 more...)

2502.05589

Country:

Europe > Western Europe (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > Pennsylvania (0.04)
(5 more...)

Genre: Research Report > New Finding (0.87)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Health & Medicine (0.66)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Varela, Guilherme S., Sardinha, Alberto, Melo, Francisco S.

Distributed Value Decomposition Networks with Networked Agents

We investigate the problem of distributed training under partial observability, whereby cooperative multi-agent reinforcement learning (MARL) agents maximize the expected cumulative joint reward. We propose distributed value decomposition networks (DVDN) that generate a joint Q-function that factorizes into agent-wise Q-functions. Whereas the original value decomposition networks rely on centralized training, our approach is suitable for domains where centralized training is not possible and agents must learn by interacting with the physical environment in a decentralized manner while communicating with their peers. DVDN overcomes the need for centralized training by locally estimating the shared objective. We contribute two innovative algorithms, DVDN and DVDN (GT), for the heterogeneous and homogeneous agent settings, respectively. Empirically, both algorithms approximate the performance of value decomposition networks, in spite of the information loss during communication, as demonstrated in ten MARL tasks in three standard environments.
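The factorization mentioned above can be made concrete with a minimal sketch of the additive form used by value decomposition networks, where the joint Q-value is the sum of agent-wise Q-values. The Q-tables are hypothetical numbers, and the sketch does not cover the distributed, peer-to-peer estimation that DVDN adds; it only shows why an additive joint Q-function lets each agent act greedily on its own values.

```python
from itertools import product


def joint_q(agent_qs, joint_action):
    """Additive factorization: Q_joint(a_1..a_n) = sum_i Q_i(a_i)."""
    return sum(q[a] for q, a in zip(agent_qs, joint_action))


def greedy_decentralized(agent_qs):
    """Each agent picks the argmax of its own Q-values, with no coordination."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in agent_qs)


def greedy_centralized(agent_qs):
    """Brute-force argmax over the full joint action space, for comparison."""
    spaces = [range(len(q)) for q in agent_qs]
    return max(product(*spaces), key=lambda ja: joint_q(agent_qs, ja))


# Hypothetical per-agent Q-values for one (local) observation each.
agent_qs = [
    [0.1, 0.7, 0.3],   # agent 0, three actions
    [0.5, 0.2],        # agent 1, two actions
    [0.0, 0.4, 0.9],   # agent 2, three actions
]

local_choice = greedy_decentralized(agent_qs)
global_choice = greedy_centralized(agent_qs)
```

Both selections coincide, since maximizing a sum of independent terms reduces to maximizing each term separately.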

agent, algorithm, dvdn, (14 more...)

2502.07635

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > District of Columbia > Washington (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)
(10 more...)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Lastrucci, Giacomo, Schweidtmann, Artur M.

ENFORCE: Exact Nonlinear Constrained Learning with Adaptive-depth Neural Projection

Ensuring neural networks adhere to domain-specific constraints is crucial for addressing safety and ethical concerns while also enhancing prediction accuracy. Despite the nonlinear nature of most real-world tasks, existing methods are predominantly limited to affine or convex constraints. We introduce ENFORCE, a neural network architecture that guarantees predictions to satisfy nonlinear constraints exactly. ENFORCE is trained with standard unconstrained gradient-based optimizers (e.g., Adam) and leverages autodifferentiation and local neural projections to enforce any $\mathcal{C}^1$ constraint to arbitrary tolerance $\epsilon$. We build an adaptive-depth neural projection (AdaNP) module that dynamically adjusts its complexity to suit the specific problem and the required tolerance levels. ENFORCE guarantees satisfaction of equality constraints that are nonlinear in both inputs and outputs of the neural network with minimal (and adjustable) computational cost.
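To give a feel for enforcing a nonlinear equality constraint to a tolerance, the sketch below repeatedly projects a point onto the linearization of a differentiable constraint c(y) = 0 until |c(y)| falls below eps. This is a generic Gauss-Newton style iteration, not the AdaNP module itself; the function names, the unit-circle constraint, and the starting point are all assumptions for illustration.

```python
def project(y, c, grad_c, eps=1e-8, max_steps=50):
    """Iteratively project y onto the linearized constraint c(y) = 0.

    Returns the corrected point and the number of steps used; the step count
    grows as the tolerance tightens, loosely mirroring an adaptive depth.
    """
    y = list(y)
    steps = 0
    while abs(c(y)) > eps and steps < max_steps:
        g = grad_c(y)
        norm_sq = sum(gi * gi for gi in g)
        if norm_sq == 0.0:
            break  # the linearization gives no direction to move in
        scale = c(y) / norm_sq
        # Minimum-norm correction that zeroes the first-order model of c.
        y = [yi - scale * gi for yi, gi in zip(y, g)]
        steps += 1
    return y, steps


def circle(y):
    """Hypothetical constraint: the output must lie on the unit circle."""
    return y[0] ** 2 + y[1] ** 2 - 1.0


def circle_grad(y):
    return [2.0 * y[0], 2.0 * y[1]]


raw_prediction = [2.0, 1.0]  # stand-in for an unconstrained network output
tight, tight_steps = project(raw_prediction, circle, circle_grad, eps=1e-8)
loose, loose_steps = project(raw_prediction, circle, circle_grad, eps=1e-2)
```

For this constraint the correction is purely radial, so the direction of the raw prediction is preserved while its norm is driven to one.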

constraint, neural network, projection, (16 more...)

2502.06774

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Netherlands > South Holland > Delft (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
(5 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Uncertainty Quantification and Decomposition for LLM-based Recommendation

Kweon, Wonbin, Jang, Sanghwan, Kang, SeongKu, Yu, Hwanjo

... for recommendation, we demonstrate that LLMs often exhibit uncertainty in their recommendations. To ensure the trustworthy use of LLMs in generating recommendations, we emphasize the importance of assessing the reliability of recommendations generated by LLMs. We start by introducing a novel framework for estimating the predictive uncertainty to quantitatively measure the reliability of LLM-based recommendations. We further propose to decompose the predictive uncertainty into recommendation uncertainty and prompt uncertainty, enabling in-depth analyses of the primary source of uncertainty. Through extensive experiments, we (1) demonstrate predictive uncertainty effectively indicates the reliability of LLM-based recommendations, (2) investigate the origins of uncertainty with decomposed uncertainty measures, and (3) propose uncertainty-aware prompting for a lower predictive uncertainty and enhanced recommendation. Our source code and model weights are available at https://github.com/WonbinKweon/

Instruction-tuned LLMs [4, 29, 64, 66] have shown remarkable performance for the zero-shot ranking task [23, 25], and can be further fine-tuned with the user history logged on the system [2, 19, 81]. Recent methods [10, 70, 79, 80] adopt the retrieval-augmented generation paradigm [3, 27], where LLMs are employed to generate ranking lists with candidates retrieved by candidate generators. This approach exhibits state-of-the-art recommendation performance over conventional sequential recommenders [31, 63], facilitating better online updates and avoiding hallucination. While LLMs have been widely employed in real-world applications that can influence human behavior, there is a lack of exploration in assessing the reliability of the LLM-based recommendation. Indeed, despite their superior performance, we demonstrate recommendations generated by LLMs are highly volatile depending on the prompting details (e.g., word choice, number of user histories, ...
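One common way to split a predictive uncertainty into two parts is the entropy decomposition sketched below: the entropy of the prompt-averaged distribution equals the average per-prompt entropy plus a non-negative gap that measures disagreement across prompts. Whether this matches the paper's exact definitions of recommendation uncertainty and prompt uncertainty is an assumption; the function names and the toy distributions are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy in nats, skipping zero-probability items."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def decompose(dists):
    """Split total uncertainty over items given one distribution per prompt variant.

    Returns (total, within_prompt, across_prompt) where
    total = within_prompt + across_prompt.
    """
    n = len(dists)
    mean = [sum(d[i] for d in dists) / n for i in range(len(dists[0]))]
    total = entropy(mean)                           # entropy of the averaged prediction
    within = sum(entropy(d) for d in dists) / n     # average entropy per prompt
    return total, within, total - within


# Two prompts that agree but are individually unsure (hypothetical numbers).
agree = decompose([[0.5, 0.5], [0.5, 0.5]])

# Two prompts that are individually certain but recommend different items.
disagree = decompose([[1.0, 0.0], [0.0, 1.0]])
```

In the first case all of the uncertainty sits within each prompt; in the second it comes entirely from the prompts disagreeing, even though the total is the same.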

predictive uncertainty, recommendation, recommendation performance, (13 more...)

doi: 10.1145/3696410.3714601

2501.1763

Country:

Oceania > Australia > New South Wales > Sydney (0.05)
Asia > South Korea > Gyeongsangbuk-do > Pohang (0.05)
North America > United States > New York > New York County > New York City (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Abreu, Steven, Shrestha, Sumit Bam, Zhu, Rui-Jie, Eshraghian, Jason

Neuromorphic Principles for Efficient Large Language Models on Intel Loihi 2

Large language models (LLMs) deliver impressive performance but require large amounts of energy. In this work, we present a MatMul-free LLM architecture adapted for Intel's neuromorphic processor, Loihi 2. Our approach leverages Loihi 2's support for low-precision, event-driven computation and stateful processing. Our hardware-aware quantized model on GPU demonstrates that a 370M parameter MatMul-free model can be quantized with no accuracy loss. Based on preliminary results, we report up to 3x higher throughput with 2x less energy, compared to transformer-based LLMs on an edge GPU, with significantly better scaling. Further hardware optimizations will increase throughput and decrease energy consumption. These results show the potential of neuromorphic hardware for efficient inference and pave the way for efficient reasoning models capable of generating complex, long-form text rapidly and cost-effectively.

language model, loihi 2, throughput, (16 more...)

2503.18002

Country:

North America > United States > California > Santa Cruz County > Santa Cruz (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
Europe > Italy > Tuscany > Florence (0.04)
Asia > Middle East > Qatar > Ad-Dawhah > Doha (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Energy (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Karpekov, Alexander, Chernova, Sonia, Plötz, Thomas

DISCOVER: Data-driven Identification of Sub-activities via Clustering and Visualization for Enhanced Activity Recognition in Smart Homes

Human Activity Recognition (HAR) using ambient sensors has great potential for practical applications, particularly in elder care and independent living. However, deploying HAR systems in real-world settings remains challenging due to the high cost of labeled data, the need for pre-segmented sensor streams, and the lack of flexibility in activity granularity. To address these limitations, we introduce DISCOVER, a method designed to discover fine-grained human sub-activities from unlabeled sensor data without relying on pre-segmentation. DISCOVER combines unsupervised feature extraction and clustering with a user-friendly visualization tool to streamline the labeling process. DISCOVER enables domain experts to efficiently annotate only a minimal set of representative cluster centroids, reducing the annotation workload to a small number of samples (0.05% of our dataset). We demonstrate DISCOVER's effectiveness through a re-annotation exercise on widely used HAR datasets, showing that it uncovers finer-grained activities and produces more nuanced annotations than traditional coarse labels. DISCOVER represents a step toward practical, deployable HAR systems that adapt to diverse real environments.

dataset, sequence, up-to-date information, (15 more...)

2503.01733

Country:

North America > Aruba (0.06)
Africa > Middle East > Egypt > Cairo Governorate > Cairo (0.06)
North America > United States > New York > New York County > New York City (0.04)
(5 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Information Technology > Smart Houses & Appliances (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.67)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(4 more...)

Bridging Brain Signals and Language: A Deep Learning Approach to EEG-to-Text Decoding

Gedawy, Mostafa El, Nabil, Omnia, Mamdouh, Omar, Nady, Mahmoud, Adel, Nour Alhuda, Fares, Ahmed

Translating brain activity into human language could revolutionize machine-human interaction while providing communication support to people with speech disabilities. Electronic decoding has made some progress, yet current EEG-to-text decoding methods still struggle with open vocabularies, depth of meaning, and subject-specific brain variability. We introduce a framework that moves beyond conventional closed-vocabulary EEG-to-text decoding by integrating subject-specific learning models with natural language processing methods. The method uses deep representation learning to extract salient EEG features, which are used to train neural networks that generate elaborate sentences extending beyond the original data content. On the ZuCo dataset, the approach achieves higher BLEU, ROUGE, and BERTScore than current methods. The results indicate that the framework is an effective way to generate meaningful and correct text while accounting for individual brain variations. The research aims to connect open-vocabulary text generation systems with human brain signal interpretation in order to develop effective brain-to-text systems. It has interdisciplinary implications for assistive technology and personalized communication systems, extending the possibilities for human-computer interaction in various settings.

eeg signal, module, representation, (13 more...)

2502.17465

Country:

North America > United States > Florida > Dade County (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
Africa > Middle East > Egypt (0.04)
(10 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems, Feb-10-2025, 15:59:57 GMT

A Dataset Analysis

Figure 8: AVA samples annotated with an aesthetic score of 5, whose proposed sentiment score varies between 0.39 and 0.98. For each image we report the overall sentiment score (top of the image) and the comments, with the corresponding predicted sentiment score in bold.

artificial intelligence, machine learning, natural language, (17 more...)

Country:

Oceania > Australia (0.04)
North America > United States (0.04)
North America > Canada (0.04)
(3 more...)

Industry:

Information Technology (1.00)
Media > Photography (0.93)
Law (0.93)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)