AITopics

2209.13635

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)
Asia > Macao (0.04)
Asia > China (0.04)

Genre: Research Report > Promising Solution (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Saef, Danial, Wang, Yuanrong, Aste, Tomaso

Regime-based Implied Stochastic Volatility Model for Crypto Option Pricing

arXiv.org Artificial IntelligenceSep-27-2022

The increasing adoption of Digital Assets (DAs), such as Bitcoin (BTC), rises the need for accurate option pricing models. Yet, existing methodologies fail to cope with the volatile nature of the emerging DAs. Many models have been proposed to address the unorthodox market dynamics and frequent disruptions in the microstructure caused by the non-stationarity, and peculiar statistics, in DA markets. However, they are either prone to the curse of dimensionality, as additional complexity is required to employ traditional theories, or they overfit historical patterns that may never repeat. Instead, we leverage recent advances in market regime (MR) clustering with the Implied Stochastic Volatility Model (ISVM). Time-regime clustering is a temporal clustering method, that clusters the historic evolution of a market into different volatility periods accounting for non-stationarity. ISVM can incorporate investor expectations in each of the sentiment-driven periods by using implied volatility (IV) data. In this paper, we applied this integrated time-regime clustering and ISVM method (termed MR-ISVM) to high-frequency data on BTC options at the popular trading platform Deribit. We demonstrate that MR-ISVM contributes to overcome the burden of complex adaption to jumps in higher order characteristics of option pricing models. This allows us to price the market based on the expectations of its participants in an adaptive fashion.

data mining, machine learning, volatility, (15 more...)

2208.12614

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > e-Commerce > Financial Technology (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Egorov, Sergey, Gevorgyan, Narek, Zaytsev, Alexey

Self-supervised similarity models based on well-logging data

Adopting data-based approaches leads to model improvement in numerous Oil&Gas logging data processing problems. These improvements become even more sound due to new capabilities provided by deep learning. However, usage of deep learning is limited to areas where researchers possess large amounts of high-quality data. We present an approach that provides universal data representations suitable for solutions to different problems for different oil fields with little additional data. Our approach relies on the self-supervised methodology for sequential logging data for intervals from well, so it also doesn't require labelled data from the start. For validation purposes of the received representations, we consider classification and clusterization problems. We as well consider the transfer learning scenario. We found out that using the variational autoencoder leads to the most reliable and accurate models. approach We also found that a researcher only needs a tiny separate data set for the target oil field to solve a specific problem on top of universal representations.

artificial intelligence, machine learning, representation, (15 more...)

2209.12444

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Miasnikof, Pierre, Bagherbeik, Mohammad, Sheikholeslami, Ali

Graph clustering with Boltzmann machines

arXiv.org Machine LearningSep-26-2022

Graph clustering is the process of grouping vertices into densely connected sets called clusters. We tailor two mathematical programming formulations from the literature, to this problem. In doing so, we obtain a heuristic approximation to the intra-cluster density maximization problem. We use two variations of a Boltzmann machine heuristic to obtain numerical solutions. For benchmarking purposes, we compare solution quality and computational performances to those obtained using a commercial solver, Gurobi. We also compare clustering quality to the clusters obtained using the popular Louvain modularity maximization method. Our initial results clearly demonstrate the superiority of our problem formulations. They also establish the superiority of the Boltzmann machine over the traditional exact solver. In the case of smaller less complex graphs, Boltzmann machines provide the same solutions as Gurobi, but with solution times that are orders of magnitude lower. In the case of larger and more complex graphs, Gurobi fails to return meaningful results within a reasonable time frame. Finally, we also note that both our clustering formulations, the distance minimization and $K$-medoids, yield clusters of superior quality to those obtained with the Louvain algorithm.

artificial intelligence, machine learning, optimization problem, (17 more...)

arXiv.org Machine Learning

doi: 10.1016/j.dam.2023.10.012

2203.02471

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Li, Qinglin, Qiu, Guoping

Improving Image Clustering through Sample Ranking and Its Application to remote--sensing images

Image clustering is a very useful technique that is widely applied to various areas, including remote sensing. Recently, visual representations by self-supervised learning have greatly improved the performance of image clustering. To further improve the well-trained clustering models, this paper proposes a novel method by first ranking samples within each cluster based on the confidence in their belonging to the current cluster and then using the ranking to formulate a weighted cross-entropy loss to train the model. For ranking the samples, we developed a method for computing the likelihood of samples belonging to the current clusters based on whether they are situated in densely populated neighborhoods, while for training the model, we give a strategy for weighting the ranked samples. We present extensive experimental results that demonstrate that the new technique can be used to improve the State-of-the-Art image clustering models, achieving accuracy performance gains ranging from $2.1\%$ to $15.9\%$. Performing our method on a variety of datasets from remote sensing, we show that our method can be effectively applied to remote--sensing images.

artificial intelligence, machine learning, proceedings, (15 more...)

doi: 10.3390/rs14143317

2209.12621

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > China > Guangdong Province > Shenzhen (0.05)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
(29 more...)

Genre: Research Report (0.70)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (1.00)
Leisure & Entertainment > Sports (0.70)
Transportation > Ground (0.67)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Lenssen, Lars, Schubert, Erich

Clustering by Direct Optimization of the Medoid Silhouette

The evaluation of clustering results is difficult, highly dependent on the evaluated data set and the perspective of the beholder. There are many different clustering quality measures, which try to provide a general measure to validate clustering results. A very popular measure is the Silhouette. We discuss the efficient medoid-based variant of the Silhouette, perform a theoretical analysis of its properties, and provide two fast versions for the direct optimization. We combine ideas from the original Silhouette with the well-known PAM algorithm and its latest improvements FasterPAM. One of the versions guarantees equal results to the original variant and provides a run speedup of $O(k^2)$. In experiments on real data with 30000 samples and $k$=100, we observed a 10464$\times$ speedup compared to the original PAMMEDSIL algorithm.

artificial intelligence, machine learning, silhouette, (17 more...)

doi: 10.1007/978-3-031-17849-8_15

2209.12553

Country: Europe > Germany > North Rhine-Westphalia > Arnsberg Region > Dortmund (0.04)

Genre: Research Report (0.40)

Industry: Health & Medicine (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.68)

5-Star Hotel Customer Satisfaction Analysis Using Hybrid Methodology

Yoo, Yongmin, Park, Yeongjoon, Lim, Dongjin, Seo, Deaho

Due to the rapid development of non-face-to-face services due to the corona virus, commerce through the Internet, such as sales and reservations, is increasing very rapidly. Consumers also post reviews, suggestions, or judgments about goods or services on the website. The review data directly used by consumers provides positive feedback and nice impact to consumers, such as creating business value. Therefore, analysing review data is very important from a marketing point of view. Our research suggests a new way to find factors for customer satisfaction through review data. We applied a method to find factors for customer satisfaction by mixing and using the data mining technique, which is a big data analysis method, and the natural language processing technique, which is a language processing method, in our research. Unlike many studies on customer satisfaction that have been conducted in the past, our research has a novelty of the thesis by using various techniques. And as a result of the analysis, the results of our experiments were very accurate.

data mining, machine learning, natural language, (19 more...)

2209.12417

Country:

Europe > United Kingdom (0.04)
Asia > Middle East > UAE (0.04)
Asia > Middle East > Saudi Arabia (0.04)
(7 more...)

Genre: Research Report > New Finding (0.86)

Industry: Consumer Products & Services > Hotels (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.96)

Ferrante, Matteo, Boccato, Tommaso, Spasov, Simeon, Duggento, Andrea, Toschi, Nicola

VAESim: A probabilistic approach for self-supervised prototype discovery

arXiv.org Artificial IntelligenceSep-25-2022

In medicine, curated image datasets often employ discrete labels to describe what is known to be a continuous spectrum of healthy to pathological conditions, such as e.g. the Alzheimer's Disease Continuum or other areas where the image plays a pivotal point in diagnosis. We propose an architecture for image stratification based on a conditional variational autoencoder. Our framework, VAESim, leverages a continuous latent space to represent the continuum of disorders and finds clusters during training, which can then be used for image/patient stratification. The core of the method learns a set of prototypical vectors, each associated with a cluster. First, we perform a soft assignment of each data sample to the clusters. Then, we reconstruct the sample based on a similarity measure between the sample embedding and the prototypical vectors of the clusters. To update the prototypical embeddings, we use an exponential moving average of the most similar representations between actual prototypes and samples in the batch size. We test our approach on the MNIST-handwritten digit dataset and on a medical benchmark dataset called PneumoniaMNIST. We demonstrate that our method outperforms baselines in terms of kNN accuracy measured on a classification task against a standard VAE (up to 15% improvement in performance) in both datasets, and also performs at par with classification models trained in a fully supervised way. We also demonstrate how our model outperforms current, end-to-end models for unsupervised stratification.

artificial intelligence, latent space, machine learning, (19 more...)

2209.12279

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States (0.04)

Genre: Research Report (0.51)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.54)
Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Brenner, Aron, Khorramfar, Rahman, Mallapragada, Dharik, Amin, Saurabh

Graph Representation Learning for Energy Demand Data: Application to Joint Energy System Planning under Emissions Constraints

arXiv.org Artificial IntelligenceSep-24-2022

A rapid transformation of current electric power and natural gas (NG) infrastructure is imperative to meet the mid-century goal of CO2 emissions reduction requires. This necessitates a long-term planning of the joint power-NG system under representative demand and supply patterns, operational constraints, and policy considerations. Our work is motivated by the computational and practical challenges associated with solving the generation and transmission expansion problem (GTEP) for joint planning of power-NG systems. Specifically, we focus on efficiently extracting a set of representative days from power and NG data in respective networks and using this set to reduce the computational burden required to solve the GTEP. We propose a Graph Autoencoder for Multiple time resolution Energy Systems (GAMES) to capture the spatio-temporal demand patterns in interdependent networks and account for differences in the temporal resolution of available data. The resulting embeddings are used in a clustering algorithm to select representative days. We evaluate the effectiveness of our approach in solving a GTEP formulation calibrated for the joint power-NG system in New England. This formulation accounts for the physical interdependencies between power and NG systems, including the joint emissions constraint. Our results show that the set of representative days obtained from GAMES not only allows us to tractably solve the GTEP formulation, but also achieves a lower cost of implementing the joint planning decisions.

artificial intelligence, machine learning, representative day, (17 more...)

2209.12035

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England (0.04)
Europe > France (0.04)

Genre: Research Report > New Finding (0.54)

Industry:

Energy > Power Industry (1.00)
Energy > Renewable > Wind (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.90)

#artificialintelligenceSep-23-2022, 07:45:47 GMT

Codeless Machine Learning for Auditors

In the new era of digitalisation, there are emerging challenges relating to the new technology risks while fraud risk increases, adopting new and innovative methods in line with technological progress. In this environment, audit should develop data analytics skills that are beyond traditional risk monitoring and fraud detection tools to meet stakeholders' evolving expectations. Clustering is an unsupervised Machine Learning (ML) algorithm (i.e. an algorithm that learns and improves from experience, without input from users) that looks for patterns in data by dividing it into clusters. These clusters are created such that the points are homogenous within the cluster and heterogenous across clusters. Clustering is commonly used in market segmentation and several areas of marketing analytics as well as in fraud detection.

codeless machine learning, graph, outlier, (7 more...)

#artificialintelligence

Genre: Research Report (0.37)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.33)