AITopics | South America


South America

Learning an Optimal Assortment Policy under Observational Data

Han, Yuxuan, Zhong, Han, Lu, Miao, Blanchet, Jose, Zhou, Zhengyuan

arXiv.org Machine Learning, Feb-10-2025

We study the fundamental problem of offline assortment optimization under the Multinomial Logit (MNL) model, where sellers must determine the optimal subset of products to offer based solely on historical customer choice data. While most existing approaches to learning-based assortment optimization focus on the online learning of the optimal assortment through repeated interactions with customers, such exploration can be costly or even impractical in many real-world settings. In this paper, we consider the offline learning paradigm and investigate the minimal data requirements for efficient offline assortment optimization. To this end, we introduce Pessimistic Rank-Breaking (PRB), an algorithm that combines rank-breaking with pessimistic estimation. We prove that PRB is nearly minimax optimal by establishing a tight suboptimality upper bound and a nearly matching lower bound. This further shows that "optimal item coverage" - where each item in the optimal assortment appears sufficiently often in the historical data - is both sufficient and necessary for efficient offline learning. This significantly relaxes the previous requirement of observing the complete optimal assortment in the data. Our results provide fundamental insights into the data requirements for offline assortment optimization under the MNL model.
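
To make the MNL setting in the abstract above concrete, here is a minimal sketch of how choice probabilities and expected revenue are commonly defined for an assortment, followed by a brute-force search over assortments. It is not the PRB algorithm: the attraction weights `v`, the revenues `r`, and all function names are hypothetical, the weights are assumed to be known rather than estimated from data, and the no-purchase option is assumed to have weight 1.

```python
from itertools import combinations


def mnl_choice_probs(assortment, v):
    """P(customer picks i | assortment) = v[i] / (1 + sum of v over the assortment).

    The constant 1 is the (assumed) weight of the no-purchase option.
    """
    denom = 1.0 + sum(v[j] for j in assortment)
    return {i: v[i] / denom for i in assortment}


def expected_revenue(assortment, v, r):
    """Expected revenue per customer when offering `assortment`."""
    probs = mnl_choice_probs(assortment, v)
    return sum(r[i] * p for i, p in probs.items())


def best_assortment(v, r):
    """Exhaustive search over non-empty assortments (only viable for few items)."""
    items = list(v)
    best, best_rev = (), 0.0
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            rev = expected_revenue(candidate, v, r)
            if rev > best_rev:
                best, best_rev = candidate, rev
    return best, best_rev


# Hypothetical two-item instance: "a" is high-margin, "b" is popular but cheap.
v = {"a": 1.0, "b": 2.0}
r = {"a": 4.0, "b": 1.0}
print(best_assortment(v, r))
```

In this toy instance, offering the popular low-revenue item alongside the high-revenue one lowers expected revenue, which is why the choice of subset matters at all.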

artificial intelligence, assortment, machine learning, (18 more...)

arXiv.org Machine Learning

2502.06777

Country:

South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > New York (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)


Jakiro: Boosting Speculative Decoding with Decoupled Multi-Head via MoE

Huang, Haiduo, Yang, Fuwei, Liu, Zhenhua, Xu, Yixing, Li, Jinze, Liu, Yang, Yin, Xuanwu, Li, Dong, Ren, Pengju, Barsoum, Emad

arXiv.org Artificial Intelligence, Feb-10-2025

Speculative decoding (SD) accelerates large language model inference by using a smaller draft model to predict multiple tokens, which are then verified in parallel by the larger target model. However, the limited capacity of the draft model often necessitates tree-based sampling to improve prediction accuracy, where multiple candidates are generated at each step. We identify a key limitation in this approach: the candidates at the same step are derived from the same representation, limiting diversity and reducing overall effectiveness. To address this, we propose Jakiro, leveraging Mixture of Experts (MoE), where independent experts generate diverse predictions, effectively decoupling correlations among candidates. Furthermore, we introduce a hybrid inference strategy, combining autoregressive decoding for initial tokens with parallel decoding for subsequent stages, and enhance the latter with a contrastive mechanism in features to improve accuracy. Our method significantly boosts prediction accuracy and achieves higher inference speedups. Extensive experiments across diverse models validate the effectiveness and robustness of our approach, establishing a new SOTA in speculative decoding. Our code is available at https://github.com/haiduo/Jakiro.
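
The draft-then-verify loop described in the first sentence of the abstract above can be illustrated with a toy sketch. This is the plain greedy variant of speculative decoding, not Jakiro: there is no tree sampling, no MoE head, and no probabilistic acceptance rule. The "models" are hypothetical deterministic functions, and verification is written as a loop although a real target model would check all drafted positions in one parallel pass.

```python
def speculative_step(prefix, draft_next, target_next, k):
    """One draft-then-verify round; returns the tokens to append to `prefix`."""
    # 1) The draft model proposes k tokens autoregressively.
    proposal, ctx = [], list(prefix)
    for _ in range(k):
        tok = draft_next(ctx)
        proposal.append(tok)
        ctx.append(tok)

    # 2) The target model checks each drafted token against its own greedy choice.
    accepted, ctx = [], list(prefix)
    for tok in proposal:
        expected = target_next(ctx)
        if tok != expected:
            accepted.append(expected)  # replace the first mismatch and stop
            return accepted
        accepted.append(tok)
        ctx.append(tok)

    # 3) Every draft was accepted, so the target contributes one extra token.
    accepted.append(target_next(ctx))
    return accepted


# Hypothetical toy models over digit tokens: the target counts upward mod 10,
# and the draft agrees except that it errs right after seeing a 3.
def target_next(ctx):
    return (ctx[-1] + 1) % 10


def draft_next(ctx):
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10


print(speculative_step([1], draft_next, target_next, k=4))
print(speculative_step([5], draft_next, target_next, k=3))
```

Each round yields at least one token that matches what the target alone would have produced, so in this greedy variant the output is unchanged and the speedup depends entirely on how often the draft agrees with the target.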

large language model, machine learning, mechanism, (20 more...)

arXiv.org Artificial Intelligence

2502.06282

Country:

Asia > Middle East > Jordan (0.04)
Asia > China > Shaanxi Province > Xi'an (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)
(5 more...)

Genre:

Research Report (0.64)
Workflow (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.94)


Analytical Lyapunov Function Discovery: An RL-based Generative Approach

Zou, Haohan, Feng, Jie, Zhao, Hao, Shi, Yuanyuan

arXiv.org Artificial Intelligence, Feb-10-2025

Despite advances in learning-based methods, finding valid Lyapunov functions for nonlinear dynamical systems remains challenging. Current neural network approaches face two main issues: challenges in scalable verification and limited interpretability. To address these, we propose an end-to-end framework using transformers to construct local analytical Lyapunov functions, which simplifies formal verification, enhances interpretability, and provides valuable insights for control engineers. Our framework consists of a transformer-based trainer that generates candidate Lyapunov functions and a falsifier that verifies candidate expressions and refines the model via risk-seeking policy gradient. Unlike Alfarano et al. (2024), which utilizes pre-training and seeks global Lyapunov functions for low-dimensional systems, our model is trained from scratch via reinforcement learning (RL) and succeeds in finding local Lyapunov functions for high-dimensional and non-polynomial systems. Given the analytical nature of the candidates, we employ efficient optimization methods for falsification during training and formal verification tools for the final verification. We demonstrate the efficiency of our approach on a range of nonlinear dynamical systems with up to ten dimensions and show that it can discover Lyapunov functions not previously identified in the control literature.
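
As a rough illustration of what a falsifier checks, the sketch below tests the two local Lyapunov conditions, V(x) > 0 and dV/dt = ∇V(x)·f(x) < 0 away from the origin, on randomly sampled states. Random sampling is a stand-in chosen for brevity; the abstract describes optimization-based falsification and formal verification, neither of which is reproduced here. The dynamics `f`, the candidates, and all names are hypothetical.

```python
import random


def falsify(V, grad_V, f, dim, box=1.0, n_samples=2000, min_norm=1e-3, seed=0):
    """Search the box [-box, box]^dim for a state violating the Lyapunov conditions.

    Returns a counterexample state, or None if no violation was sampled.
    A None result is evidence only, not a proof of validity.
    """
    rng = random.Random(seed)
    for _ in range(n_samples):
        x = [rng.uniform(-box, box) for _ in range(dim)]
        if sum(xi * xi for xi in x) ** 0.5 < min_norm:
            continue  # skip states numerically indistinguishable from the origin
        v = V(x)
        v_dot = sum(g * fx for g, fx in zip(grad_V(x), f(x)))
        if v <= 0.0 or v_dot >= 0.0:
            return x
    return None


# Hypothetical stable linear system: x1' = -x1 + x2, x2' = -x1 - x2.
def f(x):
    return [-x[0] + x[1], -x[0] - x[1]]


# Candidate 1: V = x1^2 + x2^2, for which dV/dt = -2 (x1^2 + x2^2) < 0.
def good_V(x):
    return x[0] ** 2 + x[1] ** 2


def good_grad(x):
    return [2 * x[0], 2 * x[1]]


# Candidate 2: V = x1 * x2, which is negative in two quadrants.
def bad_V(x):
    return x[0] * x[1]


def bad_grad(x):
    return [x[1], x[0]]


print(falsify(good_V, good_grad, f, dim=2))  # no violation found
print(falsify(bad_V, bad_grad, f, dim=2))    # a violating state
```

Because the candidates are closed-form expressions, their gradients can be written down exactly, which is part of what makes analytical candidates easier to check than neural ones.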

analytical lyapunov function discovery, expression, lyapunov function, (11 more...)

arXiv.org Artificial Intelligence

2502.02014

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
Europe > Switzerland (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)


Electricity Demand Forecasting in Future Grid States: A Digital Twin-Based Simulation Study

Bayer, Daniel R., Haag, Felix, Pruckner, Marco, Hopf, Konstantin

arXiv.org Artificial Intelligence, Feb-10-2025

Short-term forecasting of residential electricity demand is an important task for utilities. Yet, many small and medium-sized utilities still use simple forecasting approaches such as Synthesized Load Profiles (SLPs), which treat residential households similarly and account neither for renewable energy installations nor for novel large consumers (e.g., heat pumps, electric vehicles). The effectiveness of such "one-fits-all" approaches in future grid states, where decentralized generation and sector coupling increase, is questionable. Our study challenges these forecasting practices and investigates whether Machine Learning (ML) approaches are suited to predict electricity demand in today's and in future grid states. We use real smart meter data from 3,511 households in Germany over 34 months. We extrapolate this data with future grid states (i.e., increased decentralized generation and storage) based on a digital twin of a local energy system. Our results show that Long Short-Term Memory (LSTM) approaches outperform SLPs as well as simple benchmark estimators with up to 68.5% lower Root Mean Squared Error for a day-ahead forecast, especially in future grid states. Nevertheless, all prediction approaches perform worse in future grid states. Our findings therefore reinforce the need (a) for utilities and grid operators to employ ML approaches instead of traditional demand prediction methods in future grid states and (b) to prepare current ML methods for future grid states.
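
For readers unfamiliar with the metric, the sketch below shows how Root Mean Squared Error and a "percent lower RMSE" comparison between a model and a baseline are typically computed. The load values are invented for illustration and do not come from the study; the function names are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def percent_lower(baseline_error, model_error):
    """How much lower the model's error is, as a percentage of the baseline's."""
    return 100.0 * (baseline_error - model_error) / baseline_error


# Invented day-ahead example: four demand readings (kW) and two forecasts.
actual = [0.0, 0.0, 0.0, 0.0]
baseline_forecast = [2.0, 2.0, 2.0, 2.0]  # stands in for a load-profile baseline
model_forecast = [1.0, 1.0, 1.0, 1.0]     # stands in for an ML forecast

baseline_rmse = rmse(actual, baseline_forecast)
model_rmse = rmse(actual, model_forecast)
print(baseline_rmse, model_rmse, percent_lower(baseline_rmse, model_rmse))
```

Because errors are squared before averaging, RMSE penalizes a few large misses more heavily than many small ones, which matters for peak-driven loads such as heat pumps and vehicle charging.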

digital twin, forecasting, future grid state, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.23919/SpliTech61897.2024.10612563

2503.04757

Country:

Europe > Germany (0.36)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
South America (0.04)
(5 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Energy > Renewable (1.00)
Energy > Power Industry (1.00)
Transportation > Ground > Road (0.87)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


The Case for Cleaner Biosignals: High-fidelity Neural Compressor Enables Transfer from Cleaner iEEG to Noisier EEG

Carzaniga, Francesco Stefano, Hoppeler, Gary Tom, Hersche, Michael, Schindler, Kaspar Anton, Rahimi, Abbas

arXiv.org Artificial Intelligence, Feb-10-2025

Not all data modalities are created equal, even when the signal they measure comes from the same source. In the case of the brain, two of the most important data modalities are the scalp electroencephalogram (EEG) and the intracranial electroencephalogram (iEEG). Nonetheless, both EEG and iEEG are important sources of data for human neurology, from healthcare to brain-machine interfaces. They are used by human experts, supported by deep learning (DL) models, to accomplish a variety of tasks, such as seizure detection and motor imagery classification. Although the differences between EEG and iEEG are well understood by human experts, the performance of DL models across these two modalities remains under-explored. To help characterize the importance of clean data on the performance of DL models, we propose BrainCodec, a high-fidelity EEG and iEEG neural compressor. We find that training BrainCodec on iEEG and then transferring to EEG yields higher reconstruction quality than training on EEG directly. In addition, we find that training BrainCodec on both EEG and iEEG improves fidelity when reconstructing EEG. Our work indicates that data sources with higher SNR, such as iEEG, provide better performance across the board also in the medical time-series domain. This finding is consistent with reports coming from natural language processing, where clean data sources appear to have an outsized effect on the performance of the DL model overall. BrainCodec also achieves up to a 64x compression on iEEG and EEG without a notable decrease in quality. We also evaluate the fidelity of the compressed signals objectively on a seizure detection and a motor imagery task performed by standard DL models. Here, we find that BrainCodec achieves a reconstruction fidelity high enough to ensure no performance degradation on the downstream tasks.
Finally, we collect the subjective assessment of an expert neurologist, who confirms the high reconstruction quality of BrainCodec in a realistic scenario. Collecting high signal-to-noise ratio (SNR) data can prove to be a challenging endeavor in many situations, especially when considering human data. However, noisier signals are sometimes adequate to perform the task at hand. Following this principle, different data modalities can be collected from the same source with varying levels of quality.

braincodec, compression ratio, dataset, (13 more...)

arXiv.org Artificial Intelligence

2502.17462

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)
North America > United States > Massachusetts (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)


Dynamic Rank Factor Model for Text Streams

Shaobo Han, Lin Du, Esther Salazar, Lawrence Carin

Neural Information Processing Systems, Feb-9-2025, 20:26:26 GMT

We propose a semi-parametric and dynamic rank factor model for topic modeling, capable of (i) discovering topic prevalence over time, and (ii) learning contemporary multi-scale dependence structures, providing topic and word correlations as a byproduct. The high-dimensional and time-evolving ordinal/rank observations (such as word counts), after an arbitrary monotone transformation, are well accommodated through an underlying dynamic sparse factor model. The framework naturally admits heavy-tailed innovations, capable of inferring abrupt temporal jumps in the importance of topics. Posterior inference is performed through straightforward Gibbs sampling, based on the forward-filtering backward-sampling algorithm. Moreover, an efficient data subsampling scheme is leveraged to speed up inference on massive datasets. The modeling framework is illustrated on two real datasets: the US State of the Union Address and the JSTOR collection from Science.

artificial intelligence, correlation, machine learning, (15 more...)

Neural Information Processing Systems

Country:

North America > Mexico (0.29)
North America > Cuba (0.14)
North America > Panama (0.14)
(7 more...)

Industry:

Energy (0.93)
Banking & Finance (0.93)
Government > Regional Government > North America Government > United States Government (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)


Grouping-Based Low-Rank Trajectory Completion and 3D Reconstruction

Katerina Fragkiadaki, Marta Salas, Pablo Arbelaez, Jitendra Malik

Neural Information Processing Systems, Feb-9-2025, 17:32:14 GMT

Extracting the 3D shape of deforming objects in monocular videos, a task known as non-rigid structure-from-motion (NRSfM), has so far been studied only on synthetic datasets and controlled environments. Typically, the objects to reconstruct are pre-segmented, they exhibit limited rotations and occlusions, or full-length trajectories are assumed. In order to integrate NRSfM into current video analysis pipelines, one needs to consider as input realistic, and thus incomplete, tracking, and perform spatio-temporal grouping to segment the objects from their surroundings. Furthermore, NRSfM needs to be robust to noise in both segmentation and tracking, e.g., drifting, segmentation "leaking", optical flow "bleeding", etc. In this paper, we make a first attempt towards this goal, and propose a method that combines dense optical flow tracking, motion trajectory clustering and NRSfM for 3D reconstruction of objects in videos. For each trajectory cluster, we compute multiple reconstructions by minimizing the reprojection error and the rank of the 3D shape under different rank bounds of the trajectory matrix. We show that dense 3D shape is extracted and trajectories are completed across occlusions and low textured regions, even under mild relative motion between the object and the camera. We achieve competitive results on a public NRSfM benchmark while using fixed parameters across all sequences and handling incomplete trajectories, in contrast to existing approaches.

artificial intelligence, image understanding, trajectory, (19 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
South America > Colombia > Bogotá D.C. > Bogotá (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Spain > Aragón > Zaragoza Province > Zaragoza (0.04)

Technology: Information Technology > Artificial Intelligence > Vision > Image Understanding (0.48)


Learning on graphs using Orthonormal Representation is Statistically Consistent

Rakesh Shivanna, Chiranjib Bhattacharyya

Neural Information Processing Systems, Feb-9-2025, 17:30:56 GMT

Existing research [4] suggests that embedding graphs on a unit sphere can be beneficial in learning labels on the vertices of a graph. However, the choice of optimal embedding remains an open issue. Orthonormal representation of graphs, a class of embeddings over the unit sphere, was introduced by Lovász [2]. In this paper, we show that there exist orthonormal representations which are statistically consistent over a large class of graphs, including power law and random graphs. This result is achieved by extending the notion of consistency designed in the inductive setting to graph transduction. As part of the analysis, we explicitly derive relationships between the Rademacher complexity measure and structural properties of graphs, such as the chromatic number.

artificial intelligence, graph, machine learning, (14 more...)

Neural Information Processing Systems

Country:

Asia > India > Karnataka > Bengaluru (0.04)
South America > Paraguay > Asunción > Asunción (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)


Divide-and-Conquer Learning by Anchoring a Conical Hull

Tianyi Zhou, Jeff A. Bilmes, Carlos Guestrin

Neural Information Processing Systems, Feb-9-2025, 17:09:15 GMT

We reduce a broad class of fundamental machine learning problems, usually addressed by EM or sampling, to the problem of finding the k extreme rays spanning the conical hull of a data point set. These k "anchors" lead to a global solution and a more interpretable model that can even outperform EM and sampling on generalization error. To find the k anchors, we propose a novel divide-and-conquer learning scheme "DCA" that distributes the problem to O(k log k) same-type sub-problems on different low-D random hyperplanes, each of which can be solved independently by any existing solver. For the 2D sub-problem, we instead present a non-iterative solver that only needs to compute an array of cosine values and its max/min entries. DCA also provides a faster subroutine inside other algorithms to check whether a point is covered in a conical hull, and thus improves these algorithms by providing significant speedups. We apply our method to GMM, HMM, LDA, NMF and subspace clustering, then show its competitive performance and scalability over other methods on large datasets.
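
To give a feel for why the 2D sub-problem needs no iteration, the sketch below finds the two extreme rays of a planar conical hull by computing one angle per point and taking the minimum and maximum. It is a simplified stand-in rather than the paper's solver: it uses signed angles from `math.atan2` instead of cosine values, and it assumes that every point is nonzero and that the cone spans less than 180 degrees. The function name and the sample points are hypothetical.

```python
import math


def anchors_2d(points):
    """Return the indices of the two points on the extreme rays of the 2D cone.

    Each point's signed angle is measured from the direction of the summed
    points, which lies strictly inside the cone under the stated assumptions.
    """
    ref_x = sum(p[0] for p in points)
    ref_y = sum(p[1] for p in points)

    def signed_angle(p):
        cross = ref_x * p[1] - ref_y * p[0]  # sign tells which side of the reference
        dot = ref_x * p[0] + ref_y * p[1]
        return math.atan2(cross, dot)

    angles = [signed_angle(p) for p in points]
    indices = range(len(points))
    lowest = min(indices, key=angles.__getitem__)
    highest = max(indices, key=angles.__getitem__)
    return lowest, highest


# Hypothetical points in the non-negative quadrant: the cone is spanned by
# (1, 0) and (0, 1); the other two points are interior.
points = [(1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (2.0, 1.0)]
print(anchors_2d(points))
```

Every other point is a non-negative combination of the two returned anchors, which is the 2D case of the "k extreme rays" in the abstract.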

artificial intelligence, conical hull, machine learning, (11 more...)

Neural Information Processing Systems

Country:

South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > New York (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)


Distributed Bayesian Posterior Sampling via Moment Sharing

Minjie Xu, Balaji Lakshminarayanan, Yee Whye Teh, Jun Zhu, Bo Zhang

Neural Information Processing Systems, Feb-9-2025, 16:05:54 GMT

We propose a distributed Markov chain Monte Carlo (MCMC) inference algorithm for large scale Bayesian posterior simulation. We assume that the dataset is partitioned and stored across nodes of a cluster. Our procedure involves an independent MCMC posterior sampler at each node based on its local partition of the data. Moment statistics of the local posteriors are collected from each sampler and propagated across the cluster using expectation propagation message passing with low communication costs. The moment sharing scheme improves posterior estimation quality by enforcing agreement among the samplers. We demonstrate the speed and inference quality of our method with empirical studies on Bayesian logistic regression and sparse linear regression with a spike-and-slab prior.

artificial intelligence, machine learning, posterior, (19 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > Massachusetts (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
