
Collaborating Authors

Rozza, Alessandro


A Primal-Dual Online Learning Approach for Dynamic Pricing of Sequentially Displayed Complementary Items under Sale Constraints

arXiv.org Artificial Intelligence

Dynamic Pricing (DP) aims to determine the ideal price for a product or service in real time using revenue-optimization strategies (see Rothschild, 1974; Kleinberg and Leighton, 2003; Trovò et al., 2018). The practice is widespread in sectors such as airlines, ride-sharing, and retail, owing to its ability to adapt to variables such as demand, competition, and time constraints. Dynamic pricing is garnering considerable attention from both industry and the scientific community due to its profound economic impact on businesses. From a scientific standpoint, while early research in this field assumed knowledge of the underlying demand functions, the demands of real-world AI applications have prompted the scientific community to shift its focus towards unknown demand scenarios and exploration-exploitation algorithms, as underscored by seminal works (e.g., Aviv and Pazgal, 2005; Besbes and Zeevi, 2009). Moreover, research on non-stationary demand functions has had a substantial impact on the field: recent studies have addressed both external non-stationarity factors, such as seasonality (as evidenced in Besbes and Saure, 2014), and internal ones driven by the seller's actions (as explored in Cui et al., 2023).
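
As a concrete illustration of the exploration-exploitation viewpoint described above, the following sketch runs a standard UCB1 bandit over a small price grid with a simulated demand curve. It is a generic baseline, not the paper's primal-dual method; the price grid, demand model, and constants are all hypothetical.

```python
# UCB1 over discrete price arms: a common exploration-exploitation
# baseline for dynamic pricing when the demand function is unknown.
import math
import random

prices = [5.0, 7.5, 10.0, 12.5, 15.0]   # hypothetical candidate prices
counts = [0] * len(prices)              # times each price was tried
revenue_sums = [0.0] * len(prices)      # accumulated revenue per price

def simulate_sale(price):
    """Hypothetical demand: purchase probability decays with price."""
    return price if random.random() < math.exp(-price / 10.0) else 0.0

for t in range(1, 5001):
    if 0 in counts:                     # try each price once first
        arm = counts.index(0)
    else:                               # then pick the highest UCB index
        ucb = [revenue_sums[i] / counts[i]
               + 15.0 * math.sqrt(2 * math.log(t) / counts[i])
               for i in range(len(prices))]
        arm = ucb.index(max(ucb))
    counts[arm] += 1
    revenue_sums[arm] += simulate_sale(prices[arm])

best = max(range(len(prices)), key=lambda i: revenue_sums[i] / counts[i])
print("empirically best price:", prices[best])
```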


A survey and taxonomy of loss functions in machine learning

arXiv.org Artificial Intelligence

Most state-of-the-art machine learning techniques revolve around the optimisation of loss functions. Defining appropriate loss functions is therefore critical to successfully solving problems in this field. We present a survey of the most commonly used loss functions for a wide range of applications, divided into classification, regression, ranking, sample generation and energy-based modelling. Overall, we introduce 33 different loss functions, organised into an intuitive taxonomy. Each loss function is given a theoretical backing, and we describe where it is best used. This survey aims to serve as a reference on the most essential loss functions for both beginner and advanced machine learning practitioners.
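
To make the taxonomy's categories concrete, here is a minimal NumPy sketch of two of its most common entries, cross-entropy for classification and mean squared error for regression; the numeric values are illustrative only.

```python
import numpy as np

def cross_entropy(probs, label):
    """Classification: negative log-likelihood of the true class."""
    return -np.log(probs[label])

def mse(y_pred, y_true):
    """Regression: average squared deviation from the target."""
    return np.mean((y_pred - y_true) ** 2)

print(cross_entropy(np.array([0.7, 0.2, 0.1]), label=0))  # ~0.357
print(mse(np.array([2.5, 0.0]), np.array([3.0, -0.5])))   # 0.25
```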


Maximum entropy exploration in contextual bandits with neural networks and energy based models

arXiv.org Artificial Intelligence

Contextual bandits can solve a huge range of real-world problems. However, the currently popular algorithms either rely on linear models or on unreliable uncertainty estimation in non-linear models, which is required to deal with the exploration-exploitation trade-off. Inspired by theories of human cognition, we introduce novel techniques that use maximum entropy exploration, relying on neural networks to find optimal policies in settings with both continuous and discrete action spaces. We present two classes of models: one with neural networks as reward estimators, and the other with energy-based models, which model the probability of obtaining an optimal reward given an action. We evaluate the performance of these models in static and dynamic contextual bandit simulation environments. We show that both techniques outperform well-known standard algorithms, with energy-based models achieving the best overall performance. This provides practitioners with techniques that perform well in static and dynamic settings and are particularly well suited to non-linear scenarios with continuous action spaces.
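
A minimal sketch of the first class of models, maximum-entropy (softmax) exploration over neural reward estimators, is given below. It assumes a small scikit-learn MLP and a hypothetical toy environment, and it does not reproduce the paper's energy-based variant.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_actions, ctx_dim, temperature = 4, 3, 0.2

def true_reward(ctx, a):
    """Hypothetical non-linear reward for the toy environment."""
    return float(np.tanh(ctx[a % ctx_dim]) + 0.1 * rng.standard_normal())

# One small neural reward estimator per discrete action.
models = [MLPRegressor(hidden_layer_sizes=(16,), max_iter=300)
          for _ in range(n_actions)]
X_hist = [[np.zeros(ctx_dim)] for _ in range(n_actions)]
y_hist = [[0.0] for _ in range(n_actions)]
for m, X, y in zip(models, X_hist, y_hist):
    m.fit(X, y)  # warm start so predict() is usable from step 0

for step in range(300):
    ctx = rng.standard_normal(ctx_dim)
    q = np.array([m.predict(ctx.reshape(1, -1))[0] for m in models])
    # Maximum-entropy exploration: sample from a softmax over the
    # estimated rewards instead of greedily taking the argmax.
    logits = (q - q.max()) / temperature
    probs = np.exp(logits) / np.exp(logits).sum()
    a = rng.choice(n_actions, p=probs)
    X_hist[a].append(ctx)
    y_hist[a].append(true_reward(ctx, a))
    if step % 25 == 0:  # periodic refit keeps the sketch cheap
        models[a].fit(np.array(X_hist[a]), np.array(y_hist[a]))
```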


Dynamic Graph Convolutional Networks

arXiv.org Machine Learning

Many classification tasks must handle structured data, which are usually modeled as graphs. Moreover, these graphs can be dynamic, meaning that the vertices and edges of each graph may change over time. Our goal is to jointly exploit structured data and temporal information through a neural network model. To the best of our knowledge, this task has not been addressed using this kind of architecture. We therefore propose two novel approaches that combine Long Short-Term Memory networks and Graph Convolutional Networks to learn long short-term dependencies together with graph structure. The quality of our methods is confirmed by the promising results achieved.
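
A minimal sketch of this kind of combination, a graph convolution applied per snapshot followed by an LSTM over each node's embedding sequence, is given below, assuming PyTorch; the dimensions and single-layer structure are illustrative, not the paper's exact architectures.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One propagation step: H' = ReLU(A_hat @ H @ W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim, bias=False)
    def forward(self, a_hat, h):
        return torch.relu(a_hat @ self.lin(h))

class DynamicGCN(nn.Module):
    """GCN per graph snapshot, then an LSTM over each node's sequence."""
    def __init__(self, in_dim, gcn_dim, lstm_dim, n_classes):
        super().__init__()
        self.gcn = GCNLayer(in_dim, gcn_dim)
        self.lstm = nn.LSTM(gcn_dim, lstm_dim, batch_first=True)
        self.out = nn.Linear(lstm_dim, n_classes)
    def forward(self, a_hats, xs):
        # a_hats: (T, N, N) normalized adjacency per time step
        # xs:     (T, N, F) node features per time step
        embs = torch.stack([self.gcn(a, x) for a, x in zip(a_hats, xs)])
        seq = embs.permute(1, 0, 2)   # (N, T, gcn_dim): one sequence per node
        h, _ = self.lstm(seq)
        return self.out(h[:, -1])     # per-node class scores at the last step

T, N, F = 5, 10, 8
model = DynamicGCN(F, 16, 32, n_classes=3)
scores = model(torch.rand(T, N, N), torch.rand(T, N, F))  # shape (N, 3)
```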


Making Deep Neural Networks Robust to Label Noise: a Loss Correction Approach

arXiv.org Machine Learning

We present a theoretically grounded approach to training deep neural networks, including recurrent networks, subject to class-dependent label noise. We propose two procedures for loss correction that are agnostic to both application domain and network architecture. They simply amount to at most a matrix inversion and a multiplication, provided we know the probability of each class being corrupted into another. We further show how to estimate these probabilities, adapting a recent technique for noise estimation to the multi-class setting, thus providing an end-to-end framework. Extensive experiments on MNIST, IMDB, CIFAR-10, CIFAR-100 and a large-scale dataset of clothing images, employing a diversity of architectures (stacking dense, convolutional, pooling, dropout, batch normalization, word embedding, LSTM and residual layers), demonstrate the noise robustness of our proposals. Incidentally, we also prove that, when ReLU is the only non-linearity, the loss curvature is immune to class-dependent label noise.
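
The two corrections can be written in a few lines once the noise transition matrix is known. The sketch below, assuming PyTorch and a known matrix T with T[i, j] = P(observed label j | true label i), shows one plausible reading of the backward correction (multiply per-class losses by the inverse of T) and the forward correction (push predictions through T); it is illustrative, not the paper's reference code.

```python
import torch
import torch.nn.functional as F

def backward_corrected_loss(logits, noisy_labels, T):
    """Backward correction: per-class loss vector times T^{-1}."""
    per_class_loss = -F.log_softmax(logits, dim=1)    # (B, C)
    corrected = per_class_loss @ torch.linalg.inv(T).T
    return corrected.gather(1, noisy_labels[:, None]).mean()

def forward_corrected_loss(logits, noisy_labels, T):
    """Forward correction: predictions pushed through T before the loss."""
    p_noisy = F.softmax(logits, dim=1) @ T            # predicted noisy dist.
    return F.nll_loss(torch.log(p_noisy + 1e-12), noisy_labels)

# Example: 3 classes, class 0 flipped to class 1 twenty percent of the time.
T = torch.tensor([[0.8, 0.2, 0.0],
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
logits = torch.randn(8, 3)
labels = torch.randint(0, 3, (8,))
print(backward_corrected_loss(logits, labels, T),
      forward_corrected_loss(logits, labels, T))
```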


DANCo: Dimensionality from Angle and Norm Concentration

arXiv.org Machine Learning

In recent decades, estimating the intrinsic dimensionality of a dataset has gained considerable importance. Despite the great deal of research devoted to this task, most of the proposed solutions prove unreliable when the intrinsic dimensionality of the input dataset is high and the manifold on which the points lie is nonlinearly embedded in a higher-dimensional space. In this paper we propose a novel, robust intrinsic dimensionality estimator that exploits the twofold, complementary information conveyed by the normalized nearest-neighbor distances and by the angles computed on pairs of neighboring points, also providing closed forms for the Kullback-Leibler divergences of the respective distributions. Experiments performed on both synthetic and real datasets highlight the robustness and effectiveness of the proposed algorithm compared to state-of-the-art methodologies.
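
Since DANCo's full estimator combines distance and angle statistics through closed-form KL divergences, the sketch below instead shows a simpler, well-known stand-in that uses only nearest-neighbor distances: the Levina-Bickel maximum-likelihood estimator. It is illustrative only and is not the algorithm proposed in the paper.

```python
import numpy as np

def mle_intrinsic_dim(X, k=10):
    """Levina-Bickel (2004) MLE from k-nearest-neighbor distances."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)                     # exclude self-distances
    knn = np.sort(D, axis=1)[:, :k]                 # (n, k) nearest distances
    log_ratios = np.log(knn[:, -1:] / knn[:, :-1])  # log(T_k / T_j), j < k
    m_hat = (k - 1) / log_ratios.sum(axis=1)        # per-point estimate
    return float(m_hat.mean())

# Example: a 2-D point cloud linearly embedded in 5-D space.
rng = np.random.default_rng(0)
Z = rng.standard_normal((500, 2))
X = Z @ rng.standard_normal((2, 5))
print(mle_intrinsic_dim(X))  # close to 2
```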