Accuracy
Applying Machine Learning to Life Insurance: some knowledge sharing to master it
Chancel, Antoine, Bradier, Laura, Ly, Antoine, Ionescu, Razvan, Martin, Laurene, Sauce, Marguerite
Machine Learning permeates many industries, which brings new source of benefits for companies. However within the life insurance industry, Machine Learning is not widely used in practice as over the past years statistical models have shown their efficiency for risk assessment. Thus insurers may face difficulties to assess the value of the artificial intelligence. Focusing on the modification of the life insurance industry over time highlights the stake of using Machine Learning for insurers and benefits that it can bring by unleashing data value. This paper reviews traditional actuarial methodologies for survival modeling and extends them with Machine Learning techniques. It points out differences with regular machine learning models and emphasizes importance of specific implementations to face censored data with machine learning models family. In complement to this article, a Python library has been developed. Different open-source Machine Learning algorithms have been adjusted to adapt the specificities of life insurance data, namely censoring and truncation. Such models can be easily applied from this SCOR library to accurately model life insurance risks.
Multiclass Sentiment Prediction for Stock Trading
Python was used to download and format NewsAPI article data relating to 400 publicly traded, low cap. Biotech companies. Crowd-sourcing was used to label a subset of this data to then train and evaluate a variety of models to classify the public sentiment of each company. The best performing models were then used to show that trading entirely off public sentiment could provide market beating returns.
Mitigating Attacks on Artificial Intelligence-based Spectrum Sensing for Cellular Network Signals
Catak, Ferhat Ozgur, Kuzlu, Murat, Sarp, Salih, Catak, Evren, Cali, Umit
Cellular networks (LTE, 5G, and beyond) are dramatically growing with high demand from consumers and more promising than the other wireless networks with advanced telecommunication technologies. The main goal of these networks is to connect billions of devices, systems, and users with high-speed data transmission, high cell capacity, and low latency, as well as to support a wide range of new applications, such as virtual reality, metaverse, telehealth, online education, autonomous and flying vehicles, advanced manufacturing, and many more. To achieve these goals, spectrum sensing has been paid more attention, along with new approaches using artificial intelligence (AI) methods for spectrum management in cellular networks. This paper provides a vulnerability analysis of spectrum sensing approaches using AI-based semantic segmentation models for identifying cellular network signals under adversarial attacks with and without defensive distillation methods. The results showed that mitigation methods can significantly reduce the vulnerabilities of AI-based spectrum sensing models against adversarial attacks.
Consensus Knowledge Graph Learning via Multi-view Sparse Low Rank Block Model
Cai, Tianxi, Xia, Dong, Zhang, Luwan, Zhou, Doudou
Network analysis has been a powerful tool to unveil relationships and interactions among a large number of objects. Yet its effectiveness in accurately identifying important node-node interactions is challenged by the rapidly growing network size, with data being collected at an unprecedented granularity and scale. Common wisdom to overcome such high dimensionality is collapsing nodes into smaller groups and conducting connectivity analysis on the group level. Dividing efforts into two phases inevitably opens a gap in consistency and drives down efficiency. Consensus learning emerges as a new normal for common knowledge discovery with multiple data sources available. To this end, this paper features developing a unified framework of simultaneous grouping and connectivity analysis by combining multiple data sources. The algorithm also guarantees a statistically optimal estimator.
Robust Causality and False Attribution in Data-Driven Earth Science Discoveries
Eldhose, Elizabeth, Chauhan, Tejasvi, Chandel, Vikram, Ghosh, Subimal, Ganguly, Auroop R.
Causal and attribution studies are essential for earth scientific discoveries and critical for informing climate, ecology, and water policies. However, the current generation of methods needs to keep pace with the complexity of scientific and stakeholder challenges and data availability combined with the adequacy of data-driven methods. Unless carefully informed by physics, they run the risk of conflating correlation with causation or getting overwhelmed by estimation inaccuracies. Given that natural experiments, controlled trials, interventions, and counterfactual examinations are often impractical, information-theoretic methods have been developed and are being continually refined in the earth sciences. Here we show that transfer entropy-based causal graphs, which have recently become popular in the earth sciences with high-profile discoveries, can be spurious even when augmented with statistical significance. We develop a subsample-based ensemble approach for robust causality analysis. Simulated data, and observations in climate and ecohydrology, suggest the robustness and consistency of this approach.
Optical Neural Ordinary Differential Equations
Zhao, Yun, Chen, Hang, Lin, Min, Zhang, Haiou, Yan, Tao, Lin, Xing, Huang, Ruqi, Dai, Qionghai
Increasing the layer number of on-chip photonic neural networks (PNNs) is essential to improve its model performance. However, the successively cascading of network hidden layers results in larger integrated photonic chip areas. To address this issue, we propose the optical neural ordinary differential equations (ON-ODE) architecture that parameterizes the continuous dynamics of hidden layers with optical ODE solvers. The ON-ODE comprises the PNNs followed by the photonic integrator and optical feedback loop, which can be configured to represent residual neural networks (ResNet) and recurrent neural networks with effectively reduced chip area occupancy. For the interference-based optoelectronic nonlinear hidden layer, the numerical experiments demonstrate that the single hidden layer ON-ODE can achieve approximately the same accuracy as the two-layer optical ResNet in image classification tasks. Besides, the ONODE improves the model classification accuracy for the diffraction-based all-optical linear hidden layer. The time-dependent dynamics property of ON-ODE is further applied for trajectory prediction with high accuracy.
Zero-Shot Imitating Collaborative Manipulation Plans from YouTube Cooking Videos
Zhang, Hejia, Zhong, Jie, Nikolaidis, Stefanos
People often watch videos on the web to learn how to cook new recipes, assemble furniture or repair a computer. We wish to enable robots with the very same capability. This is challenging; there is a large variation in manipulation actions and some videos even involve multiple persons, who collaborate by sharing and exchanging objects and tools. Furthermore, the learned representations need to be general enough to be transferable to robotic systems. On the other hand, previous work has shown that the space of human manipulation actions has a linguistic, hierarchical structure that relates actions to manipulated objects and tools. Building upon this theory of language for action, we propose a system for understanding and executing demonstrated action sequences from full-length, real-world cooking videos on the web. The system takes as input a new, previously unseen cooking video annotated with object labels and bounding boxes, and outputs a collaborative manipulation action plan for one or more robotic arms. We demonstrate performance of the system in a standardized dataset of 100 YouTube cooking videos, as well as in six full-length Youtube videos that include collaborative actions between two participants. We compare our system with a baseline system that consists of a state-of-the-art action detection baseline and show our system achieves higher action detection accuracy. We additionally propose an open-source platform for executing the learned plans in a simulation environment as well as with an actual robotic arm.
Adaptation of Autoencoder for Sparsity Reduction From Clinical Notes Representation Learning
Le, Thanh-Dung, Noumeir, Rita, Rambaud, Jerome, Sans, Guillaume, Jouvet, Philippe
When dealing with clinical text classification on a small dataset recent studies have confirmed that a well-tuned multilayer perceptron outperforms other generative classifiers, including deep learning ones. To increase the performance of the neural network classifier, feature selection for the learning representation can effectively be used. However, most feature selection methods only estimate the degree of linear dependency between variables and select the best features based on univariate statistical tests. Furthermore, the sparsity of the feature space involved in the learning representation is ignored. Goal: Our aim is therefore to access an alternative approach to tackle the sparsity by compressing the clinical representation feature space, where limited French clinical notes can also be dealt with effectively. Methods: This study proposed an autoencoder learning algorithm to take advantage of sparsity reduction in clinical note representation. The motivation was to determine how to compress sparse, high-dimensional data by reducing the dimension of the clinical note representation feature space. The classification performance of the classifiers was then evaluated in the trained and compressed feature space. Results: The proposed approach provided overall performance gains of up to 3% for each evaluation. Finally, the classifier achieved a 92% accuracy, 91% recall, 91% precision, and 91% f1-score in detecting the patient's condition. Furthermore, the compression working mechanism and the autoencoder prediction process were demonstrated by applying the theoretic information bottleneck framework.
Out-of-Distribution Detection with Hilbert-Schmidt Independence Optimization
Lin, Jingyang, Wang, Yu, Cai, Qi, Pan, Yingwei, Yao, Ting, Chao, Hongyang, Mei, Tao
Outlier detection tasks have been playing a critical role in AI safety. There has been a great challenge to deal with this task. Observations show that deep neural network classifiers usually tend to incorrectly classify out-of-distribution (OOD) inputs into in-distribution classes with high confidence. Existing works attempt to solve the problem by explicitly imposing uncertainty on classifiers when OOD inputs are exposed to the classifier during training. In this paper, we propose an alternative probabilistic paradigm that is both practically useful and theoretically viable for the OOD detection tasks. Particularly, we impose statistical independence between inlier and outlier data during training, in order to ensure that inlier data reveals little information about OOD data to the deep estimator during training. Specifically, we estimate the statistical dependence between inlier and outlier data through the Hilbert-Schmidt Independence Criterion (HSIC), and we penalize such metric during training. We also associate our approach with a novel statistical test during the inference time coupled with our principled motivation. Empirical results show that our method is effective and robust for OOD detection on various benchmarks. In comparison to SOTA models, our approach achieves significant improvement regarding FPR95, AUROC, and AUPR metrics. Code is available: \url{https://github.com/jylins/hood}.
Identifying Weaknesses in Machine Translation Metrics Through Minimum Bayes Risk Decoding: A Case Study for COMET
Amrhein, Chantal, Sennrich, Rico
Neural metrics have achieved impressive correlation with human judgements in the evaluation of machine translation systems, but before we can safely optimise towards such metrics, we should be aware of (and ideally eliminate) biases toward bad translations that receive high scores. Our experiments show that sample-based Minimum Bayes Risk decoding can be used to explore and quantify such weaknesses. When applying this strategy to COMET for en-de and de-en, we find that COMET models are not sensitive enough to discrepancies in numbers and named entities. We further show that these biases are hard to fully remove by simply training on additional synthetic data and release our code and data for facilitating further experiments.