

Machine Learning Models for Dengue Forecasting in Singapore

Lai, Zi Iun, Fung, Wai Kit, Chew, Enquan

arXiv.org Artificial Intelligence

With its prevalence emerging beyond traditionally endemic regions, the global burden of dengue is forecast to be among the fastest growing. With limited direct treatment or vaccination currently available, prevention through vector control is widely believed to be the most effective way of managing outbreaks. This study examines traditional state space models (moving average, autoregressive, ARIMA, SARIMA), supervised learning techniques (XGBoost, SVM, KNN) and deep networks (LSTM, CNN, ConvLSTM) for forecasting weekly dengue cases in Singapore. Meteorological data and search engine trends were included as features for the ML techniques. Forecasts using CNNs yielded the lowest RMSE for weekly cases in 2019.
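The simplest baselines in the study's comparison, moving average and autoregressive models, can be sketched in a few lines. The following is a minimal illustration of fitting an AR(p) model by least squares on a synthetic weekly-case series; the series, the lag order, and the fitting procedure are illustrative assumptions, not the study's actual data or code.

```python
import numpy as np

# Synthetic autocorrelated "weekly cases" (a stand-in for real dengue counts).
rng = np.random.default_rng(3)
T, p = 300, 4
cases = np.zeros(T)
for t in range(1, T):
    cases[t] = 0.8 * cases[t - 1] + rng.normal()

# Lagged design matrix: predict cases[t] from the previous p weeks.
X = np.column_stack([cases[p - 1 - j : T - 1 - j] for j in range(p)])
X = np.hstack([X, np.ones((T - p, 1))])   # intercept column
y = cases[p:]
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

# One-step-ahead forecast from the last p observed weeks.
last_lags = np.append(cases[-1 : -p - 1 : -1], 1.0)
next_week = last_lags @ coef
rmse = np.sqrt(np.mean((X @ coef - y) ** 2))
```

In the study, such baselines are compared against supervised and deep models that can also ingest exogenous features like meteorological data and search trends.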


Demolition and Reinforcement of Memories in Spin-Glass-like Neural Networks

Ventura, Enrico

arXiv.org Machine Learning

Statistical mechanics has made significant contributions to the study of biological neural systems by modeling them as recurrent networks of interconnected units with adjustable interactions. Several algorithms have been proposed to optimize the neural connections to enable network tasks such as information storage (i.e. associative memory) and learning probability distributions from data (i.e. generative modeling). Among these methods, the Unlearning algorithm, aligned with emerging theories of synaptic plasticity, was introduced by John Hopfield and collaborators. The primary objective of this thesis is to understand the effectiveness of Unlearning in both associative memory models and generative models. Initially, we demonstrate that the Unlearning algorithm can be simplified to a linear perceptron model which learns from noisy examples featuring specific internal correlations. The selection of structured training data enables an associative memory model to retrieve concepts as attractors of a neural dynamics with considerable basins of attraction. Subsequently, a novel regularization technique for Boltzmann Machines is presented, proving to outperform previously developed methods in learning hidden probability distributions from datasets. The Unlearning rule is derived from this new regularized algorithm and is shown to be comparable, in terms of inferential performance, to traditional Boltzmann-Machine learning.
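The classical Unlearning procedure the thesis builds on can be sketched as follows: store patterns with a Hebbian rule, relax the network from random states to attractors, and subtract a small Hebbian term for each attractor reached. This is a minimal illustration with illustrative sizes and rates, not the thesis's actual algorithm or parameters.

```python
import numpy as np

rng = np.random.default_rng(5)
N, P, eps = 100, 5, 0.01
patterns = rng.choice([-1.0, 1.0], size=(P, N))

# Hebbian storage of P binary patterns in symmetric couplings J.
J = patterns.T @ patterns / N
np.fill_diagonal(J, 0.0)

def relax(s, J, steps=50):
    """Synchronous dynamics s <- sign(J s) until a fixed point (or step cap)."""
    for _ in range(steps):
        s_new = np.sign(J @ s)
        s_new[s_new == 0] = 1.0
        if np.array_equal(s_new, s):
            break
        s = s_new
    return s

# Unlearning: sample an attractor from a random start, then weaken it slightly.
for _ in range(20):
    attractor = relax(rng.choice([-1.0, 1.0], size=N), J)
    J -= eps * np.outer(attractor, attractor) / N
    np.fill_diagonal(J, 0.0)
```

The intent of the subtraction step is to erode spurious attractors more than stored memories, improving retrieval; the thesis analyzes when and why this works.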


Language Models Understand Numbers, at Least Partially

Zhu, Fangwei, Dai, Damai, Sui, Zhifang

arXiv.org Artificial Intelligence

Large language models (LLMs) have exhibited impressive competence in various tasks, but their opaque internal mechanisms hinder their use in mathematical problems. In this paper, we study a fundamental question: whether language models understand numbers, a basic element in math. Based on an assumption that LLMs should be capable of compressing numbers in their hidden states to solve mathematical problems, we construct a synthetic dataset comprising addition problems and utilize linear probes to read out input numbers from the hidden states. Experimental results support the existence of compressed numbers in LLMs. However, it is difficult to precisely reconstruct the original numbers, indicating that the compression process may not be lossless. Further experiments show that LLMs can utilize encoded numbers to perform arithmetic computations, and the computational ability scales up with the model size. Our preliminary research suggests that LLMs exhibit a partial understanding of numbers, offering insights for future investigations about the models' mathematical capability.


Integrating Chemical Language and Molecular Graph in Multimodal Fused Deep Learning for Drug Property Prediction

Lu, Xiaohua, Xie, Liangxu, Xu, Lei, Mao, Rongzhi, Chang, Shan, Xu, Xiaojun

arXiv.org Artificial Intelligence

Accurately predicting molecular properties is a challenging but essential task in drug discovery. Recently, many mono-modal deep learning methods have been successfully applied to molecular property prediction. However, the inherent limitation of mono-modal learning arises from relying solely on one modality of molecular representation, which restricts a comprehensive understanding of drug molecules and hampers their resilience against data noise. To overcome these limitations, we construct multimodal deep learning models to cover different molecular representations. We convert drug molecules into three molecular representations: SMILES-encoded vectors, ECFP fingerprints, and molecular graphs. To process the modal information, a Transformer-Encoder, bi-directional gated recurrent units (BiGRU), and a graph convolutional network (GCN) are utilized for feature learning respectively, which enhances the model's capability to acquire complementary and naturally occurring bioinformatics information. We evaluated our triple-modal model on six molecule datasets. Unlike bi-modal learning models, we adopt five fusion methods to capture the specific features and better leverage the contribution of each modality. Compared with mono-modal models, our multimodal fused deep learning (MMFDL) models outperform single models in accuracy, reliability, and resistance to noise. Moreover, we demonstrate their generalization ability in the prediction of binding constants for protein-ligand complexes in the refined set of PDBbind. The advantage of the multimodal model lies in its ability to process diverse sources of data using appropriate models and suitable fusion methods, which enhances noise resistance while exploiting data diversity.
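One of the simplest fusion strategies such a model can use is a learned weighted sum of per-modality feature vectors. The sketch below assumes fixed-size encoder outputs (stand-ins for the Transformer-Encoder, BiGRU, and GCN features) and softmax-normalized modality weights; it is illustrative, not one of the paper's five fusion methods verbatim.

```python
import numpy as np

rng = np.random.default_rng(9)
batch, dim = 16, 32
smiles_feat = rng.normal(size=(batch, dim))  # stand-in: Transformer-Encoder output
ecfp_feat = rng.normal(size=(batch, dim))    # stand-in: BiGRU output on ECFP
graph_feat = rng.normal(size=(batch, dim))   # stand-in: GCN output

def weighted_sum_fusion(feats, weights):
    """Combine same-shaped modality features with softmax-normalized weights."""
    w = np.exp(weights) / np.exp(weights).sum()
    return sum(wi * f for wi, f in zip(w, feats))

fused = weighted_sum_fusion(
    [smiles_feat, ecfp_feat, graph_feat], np.array([0.2, 0.5, 0.3])
)
```

Concatenation followed by a dense layer is another common choice; weighted-sum fusion keeps the fused dimensionality equal to each branch's and makes the learned modality weights directly interpretable.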


Sim-to-Real Vision-depth Fusion CNNs for Robust Pose Estimation Aboard Autonomous Nano-quadcopter

Crupi, Luca, Cereda, Elia, Giusti, Alessandro, Palossi, Daniele

arXiv.org Artificial Intelligence

Nano-quadcopters are versatile platforms attracting the interest of both academia and industry. Their tiny form factor, i.e., $\sim$10 cm diameter, makes them particularly useful in narrow scenarios and harmless in human proximity. However, these advantages come at the price of ultra-constrained onboard computational and sensorial resources for autonomous operations. This work addresses the task of estimating human pose aboard nano-drones by fusing depth and images in a novel CNN exclusively trained in simulation yet capable of robust predictions in the real world. We extend a commercial off-the-shelf (COTS) Crazyflie nano-drone -- equipped with a 320$\times$240 px camera and an ultra-low-power System-on-Chip -- with a novel multi-zone (8$\times$8) depth sensor. We design and compare different deep-learning models that fuse depth and image inputs. Our models are trained exclusively on simulated data for both inputs, and transfer well to the real world: field testing shows an improvement of 58% and 51% of our depth+camera system w.r.t. a camera-only State-of-the-Art baseline on the horizontal and angular mean pose errors, respectively. Our prototype is based on COTS components, which facilitates reproducibility and adoption of this novel class of systems.


Embedding-based neural network for investment return prediction

Zhu, Jianlong, Xian, Dan, Fengxiao, Nie, Yichen

arXiv.org Artificial Intelligence

High investment returns require not only familiarity with policy but also extensive knowledge of the relevant industry and its news, as well as the application of relevant investment theory to decision-making. An effective investment return estimate provides feedback on the future rate of return of an investment. In recent years, deep learning has developed rapidly, and investment return prediction based on deep learning has become an emerging research topic. This paper proposes an embedding-based dual-branch approach to predict an investment's return. This approach leverages embedding to encode the investment id into a low-dimensional dense vector, thereby mapping high-dimensional data to a low-dimensional manifold so that high-dimensional features can be represented competitively. The dual-branch model decouples features by encoding different information separately in the two branches, and the swish activation function further improves model performance. Our approach is validated on the Ubiquant Market Prediction dataset. The results demonstrate the superiority of our approach compared to XGBoost, LightGBM, and CatBoost.
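The core ideas of the abstract, an embedding table mapping high-cardinality ids to dense vectors, two separately encoded branches, and the swish activation, can be sketched in plain numpy. All sizes, weights, and the fusion-by-concatenation below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(42)
n_ids, emb_dim, feat_dim = 10_000, 16, 8

# Embedding table: each investment id indexes one dense 16-d row.
embedding_table = rng.normal(size=(n_ids, emb_dim)) * 0.01

def swish(x):
    """swish(x) = x * sigmoid(x)."""
    return x / (1.0 + np.exp(-x))

def forward(ids, features, W_id, W_feat, W_out):
    id_branch = swish(embedding_table[ids] @ W_id)   # branch 1: id embedding
    feat_branch = swish(features @ W_feat)           # branch 2: numeric features
    fused = np.concatenate([id_branch, feat_branch], axis=1)
    return fused @ W_out                             # predicted return

ids = rng.integers(0, n_ids, size=32)
features = rng.normal(size=(32, feat_dim))
W_id = rng.normal(size=(emb_dim, 4))
W_feat = rng.normal(size=(feat_dim, 4))
W_out = rng.normal(size=(8, 1))
preds = forward(ids, features, W_id, W_feat, W_out)
```

Keeping the id branch and the feature branch separate until the final layer is what lets each branch specialize, which is the decoupling the abstract describes.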


Powerful 'Trick' to Choose the Right Models in Ensemble Learning

#artificialintelligence

I hope you've followed my previous articles on ensemble modeling. In this article, I'll share a crucial trick for building models using ensemble learning: 'How do you choose the right models for your ensemble process?' Imagine the following scenario (great, if you can relate to it). You are working on a classification problem and have built 1000 machine learning models. Each of the models gives you an AUC in the range of 0.7 to 0.75.
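A common selection criterion in this situation is to prefer models whose predictions are least correlated, since averaging decorrelated models of similar quality reduces variance the most. The sketch below, on synthetic predictions rather than the article's 1000 models, greedily picks models that minimize their maximum correlation with those already chosen; the greedy heuristic is an illustrative assumption, not necessarily the article's exact trick.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples = 200
base = rng.normal(size=n_samples)
# Models 0-4 are near-clones of each other; models 5-9 are independent.
preds = np.stack(
    [base + 0.1 * rng.normal(size=n_samples) for _ in range(5)]
    + [rng.normal(size=n_samples) for _ in range(5)]
)

corr = np.corrcoef(preds)  # pairwise Pearson correlation of model outputs

def pick_diverse(corr, k):
    """Greedily pick k models, each minimizing max |corr| with picks so far."""
    chosen = [0]
    while len(chosen) < k:
        rest = [m for m in range(corr.shape[0]) if m not in chosen]
        chosen.append(min(rest, key=lambda m: max(abs(corr[m, c]) for c in chosen)))
    return chosen

selected = pick_diverse(corr, k=3)
```

On this synthetic setup the greedy pass skips the near-clone models after the first pick, which is exactly the behavior you want when many of your 1000 models are minor variants of one another.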


On Clustering Time Series Using Euclidean Distance and Pearson Correlation

Berthold, Michael R., Höppner, Frank

arXiv.org Machine Learning

For time series comparisons, it has often been observed that z-score normalized Euclidean distances far outperform the unnormalized variant. In this paper we show that a z-score normalized, squared Euclidean distance is, in fact, equal to a distance based on Pearson correlation. This has profound impact on many distance-based classification or clustering methods. In addition to this theoretically sound result, we also show that the often-used k-Means algorithm formally needs a modification to keep the interpretation as Pearson correlation strictly valid. Experimental results demonstrate that, in many cases, the standard k-Means algorithm nevertheless produces the same results.
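The paper's central identity is easy to check numerically: for z-score normalized series of length n (mean 0, population standard deviation 1), the squared Euclidean distance equals 2n(1 - r), where r is the Pearson correlation of the original series. A minimal verification sketch on arbitrary synthetic series (not the authors' code):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 100
x = rng.normal(size=n).cumsum()   # two arbitrary "time series"
y = rng.normal(size=n).cumsum()

def znorm(s):
    """z-score normalization with the population std (ddof=0)."""
    return (s - s.mean()) / s.std()

d2 = np.sum((znorm(x) - znorm(y)) ** 2)   # squared Euclidean distance
r = np.corrcoef(x, y)[0, 1]               # Pearson correlation
# Identity from the paper: d2 == 2 * n * (1 - r)
```

Note the ddof=0 convention matters: with the sample standard deviation the constant changes slightly, which is related to why k-Means needs the modification the paper describes.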


Distance Correlation Methods for Discovering Associations in Large Astrophysical Databases

Martinez-Gomez, Elizabeth, Richards, Mercedes T., Richards, Donald St. P.

arXiv.org Machine Learning

High-dimensional, large-sample astrophysical databases of galaxy clusters, such as the Chandra Deep Field South COMBO-17 database, provide measurements on many variables for thousands of galaxies and a range of redshifts. Current understanding of galaxy formation and evolution rests sensitively on relationships between different astrophysical variables; hence an ability to detect and verify associations or correlations between variables is important in astrophysical research. In this paper, we apply a recently defined statistical measure called the distance correlation coefficient which can be used to identify new associations and correlations between astrophysical variables. The distance correlation coefficient applies to variables of any dimension; it can be used to determine smaller sets of variables that provide equivalent astrophysical information; it is zero only when variables are independent; and it is capable of detecting nonlinear associations that are undetectable by the classical Pearson correlation coefficient. Hence, the distance correlation coefficient provides more information than the Pearson coefficient. We analyze numerous pairs of variables in the COMBO-17 database with the distance correlation method and with the maximal information coefficient. We show that the Pearson coefficient can be estimated with higher accuracy from the corresponding distance correlation coefficient than from the maximal information coefficient. For given values of the Pearson coefficient, the distance correlation method has a greater ability than the maximal information coefficient to resolve astrophysical data into highly concentrated V-shapes, which enhances classification and pattern identification. These results are observed over a range of redshifts beyond the local universe and for galaxies from elliptical to spiral.
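The distance correlation coefficient the paper applies has a compact sample definition: double-center the pairwise distance matrices of the two variables, then take a normalized inner product. The sketch below implements that definition for 1-D samples and demonstrates the key property on a V-shaped association, where the Pearson coefficient vanishes but the distance correlation does not; it is an illustration of the statistic, not the authors' analysis code.

```python
import numpy as np

def distance_correlation(x, y):
    """Sample distance correlation of two 1-D samples (Szekely-style estimator)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    a = np.abs(x[:, None] - x[None, :])   # pairwise distance matrices
    b = np.abs(y[:, None] - y[None, :])
    # Double centering: subtract row and column means, add back the grand mean.
    A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()
    B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
    dcov2 = (A * B).mean()
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    return np.sqrt(dcov2 / denom) if denom > 0 else 0.0

# A V-shaped (nonlinear) association: Pearson r is ~0, distance correlation is not.
x = np.linspace(-1, 1, 201)
y = np.abs(x)
pearson = np.corrcoef(x, y)[0, 1]
dcor = distance_correlation(x, y)
```

This is the "capable of detecting nonlinear associations that are undetectable by the classical Pearson correlation coefficient" property in miniature; for a perfectly linear relationship the statistic equals 1.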