AITopics

doi: 10.1109/ICAICTA59291.2023.10390334

2309.16742

Country:

Asia > Indonesia > Java > Jakarta > Jakarta (0.06)
North America > United States > California > San Francisco County > San Francisco (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
Asia > Indonesia > Java > West Java > Bandung (0.04)

Genre: Research Report > New Finding (0.47)

Industry: Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.55)

Kaur, Arshmeet, Sarmadi, Morteza

Predicting loss-of-function impact of genetic mutations: a machine learning approach

arXiv.org Artificial Intelligence Jan-26-2024

The innovation of next-generation sequencing (NGS) techniques has significantly reduced the price of genome sequencing, lowering barriers to future medical research; it is now feasible to apply genome sequencing to studies where it would have previously been cost-inefficient. Identifying damaging or pathogenic mutations in vast amounts of complex, high-dimensional genome sequencing data may be of particular interest to researchers. Thus, this paper's aims were to train machine learning models on the attributes of a genetic mutation to predict LoFtool scores (which measure a gene's intolerance to loss-of-function mutations). These attributes included, but were not limited to, the position of a mutation on a chromosome, changes in amino acids, and changes in codons caused by the mutation. Models were built using the univariate feature selection technique f-regression combined with K-nearest neighbors (KNN), Support Vector Machine (SVM), Random Sample Consensus (RANSAC), Decision Trees, Random Forest, and Extreme Gradient Boosting (XGBoost). These models were evaluated using five-fold cross-validated averages of r-squared, mean squared error, root mean squared error, mean absolute error, and explained variance. The findings of this study include the training of multiple models with testing set r-squared values of 0.97.
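The evaluation protocol described above (five-fold cross-validated averages of five regression metrics) can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: the f-regression feature selection step is omitted, the toy data is synthetic, and the helper names (`regression_metrics`, `knn_predict`, `k_fold_indices`, `cross_validate`) are hypothetical. KNN is used only because it is one of the models the abstract names.

```python
import math
from statistics import mean, pvariance


def regression_metrics(y_true, y_pred):
    """Return the five metrics named in the abstract for one fold.

    Assumes y_true is not constant (otherwise r2 is undefined).
    """
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = mean(e * e for e in errors)
    var_y = pvariance(y_true)
    return {
        "r2": 1.0 - sum(e * e for e in errors) / (var_y * len(y_true)),
        "mse": mse,
        "rmse": math.sqrt(mse),
        "mae": mean(abs(e) for e in errors),
        "explained_variance": 1.0 - pvariance(errors) / var_y,
    }


def knn_predict(train_X, train_y, x, k=3):
    """Average the targets of the k nearest training rows (Euclidean distance)."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return mean(train_y[i] for i in order[:k])


def k_fold_indices(n, k=5):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    bounds = [round(i * n / k) for i in range(k + 1)]
    for lo, hi in zip(bounds, bounds[1:]):
        test = list(range(lo, hi))
        train = [i for i in range(n) if i < lo or i >= hi]
        yield train, test


def cross_validate(X, y, k_folds=5, k_neighbors=3):
    """Average each metric over the folds."""
    fold_scores = []
    for train, test in k_fold_indices(len(X), k_folds):
        train_X = [X[i] for i in train]
        train_y = [y[i] for i in train]
        preds = [knn_predict(train_X, train_y, X[i], k_neighbors) for i in test]
        fold_scores.append(regression_metrics([y[i] for i in test], preds))
    return {name: mean(s[name] for s in fold_scores) for name in fold_scores[0]}


# Synthetic stand-in data: two features, target is a linear mix of them.
X = [[float(i % 7), float(i % 5)] for i in range(35)]
y = [2.0 * a + b for a, b in X]
averages = cross_validate(X, y)
```

The folds here are contiguous and unshuffled to keep the sketch deterministic; a real study would normally shuffle before splitting.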

genetic mutation, loss-of-function impact

2402.00054

Genre: Research Report (0.89)

Industry: Health & Medicine > Therapeutic Area > Genetic Disease (0.60)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.53)

Abdalmalak, Kerlos Atia, Gallardo-Antolín, Ascensión

Enhancement of a Text-Independent Speaker Verification System by using Feature Combination and Parallel-Structure Classifiers

arXiv.org Artificial Intelligence Jan-26-2024

Speaker Verification (SV) systems involve mainly two individual stages: feature extraction and classification. In this paper, we explore these two modules with the aim of improving the performance of a speaker verification system under noisy conditions. On the one hand, the choice of the most appropriate acoustic features is a crucial factor for performing robust speaker verification. The acoustic parameters used in the proposed system are: Mel Frequency Cepstral Coefficients (MFCC), their first and second derivatives (Deltas and Delta-Deltas), Bark Frequency Cepstral Coefficients (BFCC), Perceptual Linear Predictive (PLP), and Relative Spectral Transform - Perceptual Linear Predictive (RASTA-PLP). In this paper, a complete comparison of different combinations of the previous features is discussed. On the other hand, the major weakness of a conventional Support Vector Machine (SVM) classifier is the use of generic traditional kernel functions to compute the distances among data points. However, the kernel function of an SVM has great influence on its performance. In this work, we propose the combination of two SVM-based classifiers with different kernel functions, a Linear kernel and a Gaussian Radial Basis Function (RBF) kernel, with a Logistic Regression (LR) classifier. The combination is carried out by means of a parallel structure approach, in which different voting rules for taking the final decision are considered. Results show that a significant improvement in the performance of the SV system is achieved by using the combined features with the combined classifiers, either with clean speech or in the presence of noise. Finally, to further enhance the system in noisy environments, the inclusion of a multiband noise removal technique as a preprocessing stage is proposed.
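The parallel-structure idea, in which several classifiers each cast an accept/reject vote and a voting rule fuses them, can be sketched as follows. The abstract does not say which voting rules were tried, so the three rule names here (`majority`, `unanimous`, `any`) and the function `combine_decisions` are assumptions chosen for illustration.

```python
def combine_decisions(votes, rule="majority"):
    """Fuse accept/reject votes from parallel classifiers into one decision.

    votes: list of booleans, True meaning "accept the claimed speaker".
    rule:  hypothetical rule name; the paper's actual rules are not given here.
    """
    accepts = sum(votes)
    if rule == "majority":
        return accepts * 2 > len(votes)   # strictly more than half accept
    if rule == "unanimous":
        return accepts == len(votes)      # every classifier accepts
    if rule == "any":
        return accepts >= 1               # at least one classifier accepts
    raise ValueError(f"unknown voting rule: {rule}")


# Votes from three parallel classifiers, e.g. linear SVM, RBF SVM, and LR.
votes = [True, False, True]
decisions = {rule: combine_decisions(votes, rule)
             for rule in ("majority", "unanimous", "any")}
```

A stricter rule such as `unanimous` trades more false rejections for fewer false acceptances, which is the kind of trade-off a comparison of voting rules would expose.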

enhancement, feature combination and parallel-structure classifier, text-independent speaker verification system

doi: 10.1007/s00521-016-2470-x

2401.15018

Genre: Research Report > New Finding (0.53)

Industry: Information Technology > Security & Privacy (0.60)

Technology:

Information Technology > Artificial Intelligence > Speech (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.87)

Taheri, Tayebeh, Aghaei, Alireza Afzal, Parand, Kourosh

An Orthogonal Polynomial Kernel-Based Machine Learning Model for Differential-Algebraic Equations

A system of differential-algebraic equations (DAEs) is a combination of differential equations and algebraic equations, in which the differential equations are related to the dynamical evolution of the system, and the algebraic equations are responsible for constraining the solutions that satisfy the differential and algebraic equations. DAEs serve as essential models for a wide array of physical phenomena. They find applications across various domains such as mechanical systems, electrical circuit simulations, chemical process modeling, dynamic system control, biological simulations, and control systems. Consequently, solving these intricate differential equations has remained a significant challenge for researchers. To address this, a range of techniques including numerical, analytical, and semi-analytical methods have been employed to tackle the complexities inherent in solving DAEs.
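For orientation, one common way to write such a system is the semi-explicit form below. The symbols are assumptions introduced here, not notation from the paper: x(t) denotes the differential variables, z(t) the algebraic variables, f the dynamics, and g the algebraic constraints.

```latex
\begin{aligned}
\dot{x}(t) &= f\bigl(t, x(t), z(t)\bigr), \\
0 &= g\bigl(t, x(t), z(t)\bigr).
\end{aligned}
```

The first line carries the dynamical evolution, and the second restricts the solution to the set where the constraints hold.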

differential equation, differential-algebraic equation, equation, (15 more...)

2401.14382

Country:

Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
Africa > Cameroon > Littoral Region > Douala (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)

Kimera, Richard, Rim, Daniela N., Kirabira, Joseph, Udomah, Ubong Godwin, Choi, Heeyoul

Enhanced Labeling Technique for Reddit Text and Fine-Tuned Longformer Models for Classifying Depression Severity in English and Luganda

Depression is a global burden and one of the most challenging mental health conditions to control. Experts can detect its severity early using the Beck Depression Inventory (BDI) questionnaire, administer appropriate medication to patients, and impede its progression. Due to the fear of potential stigmatization, many patients turn to social media platforms like Reddit for advice and assistance at various stages of their journey. This research extracts text from Reddit to facilitate the diagnostic process. It employs a proposed labeling approach to categorize the text and subsequently fine-tunes the Longformer model. The model's performance is compared against baseline models, including Naive Bayes, Random Forest, Support Vector Machines, and Gradient Boosting. Our findings reveal that the Longformer model outperforms the baseline models in both English (48%) and Luganda (45%) languages on a custom-made dataset.

dataset, depression, longformer model, (9 more...)

doi: 10.1109/ICTC58733.2023.10393433

2401.1424

Country:

Asia > South Korea > Gyeongsangbuk-do > Pohang (0.05)
North America > United States > New York (0.04)
Europe (0.04)
(2 more...)

Genre: Research Report (0.84)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Consumer Health (1.00)
Media > News (0.94)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)

Malik, Hasmat, Yadav, Amit Kumar, Márquez, Fausto Pedro García, Pinar-Pérez, Jesús María

Novel application of Relief Algorithm in cascaded artificial neural network to predict wind speed for wind power resource assessment in India

Wind power is non-schedulable because of the stochastic nature of meteorological variables. Hence, the energy business and the control of wind power generation require prediction of wind speed (WS) from a few seconds to various time steps in advance. To deal with prediction shortcomings, various WS prediction methods have been used. Predictive data mining offers a variety of methods for WS prediction, among which the artificial neural network (ANN) is one of the reliable and accurate methods. The results of this study show that ANN gives better accuracy than conventional models. The accuracy of WS prediction models is found to depend on the input parameters and on the type of architecture and algorithm used. The selection of the most relevant input parameters is therefore an important research area in the WS prediction field. The objective of the paper is twofold. First, an extensive review of ANN for wind power and WS prediction is carried out. Second, feature selection using the Relief Algorithm (RA) in WS prediction is discussed and analysed for different Indian sites. RA identifies atmospheric pressure, solar radiation, and relative humidity as relevant input variables. Based on these variables, a cascade ANN model is developed and its prediction accuracy is evaluated. The root mean square error (RMSE) between predicted and measured WS is found to be 1.44 m/s for training and 1.49 m/s for testing. The developed cascade ANN model can be used to predict wind speed for sites in India where no WS measuring instruments are installed.
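To give a feel for how Relief-style feature selection scores input variables, here is a simplified sketch. It is an assumption-laden illustration rather than the paper's method: it handles binary class labels only (wind speed prediction is a regression task, so the authors presumably used a regression variant), it visits every row once instead of sampling, and the function name `relief_weights` and the toy data are hypothetical.

```python
def relief_weights(X, y):
    """Relief-style feature weights for binary class labels.

    For each row, find its nearest same-class row (hit) and nearest
    other-class row (miss). A feature gains weight when it differs on the
    miss and loses weight when it differs on the hit.
    Assumes every class has at least two rows and no feature is constant.
    """
    n_features = len(X[0])
    spans = [max(r[f] for r in X) - min(r[f] for r in X)
             for f in range(n_features)]

    def diff(f, a, b):
        # Range-normalised absolute difference on one feature.
        return abs(a[f] - b[f]) / spans[f]

    def distance(a, b):
        return sum(diff(f, a, b) for f in range(n_features))

    weights = [0.0] * n_features
    for i, row in enumerate(X):
        hits = [X[j] for j in range(len(X)) if j != i and y[j] == y[i]]
        misses = [X[j] for j in range(len(X)) if y[j] != y[i]]
        near_hit = min(hits, key=lambda r: distance(row, r))
        near_miss = min(misses, key=lambda r: distance(row, r))
        for f in range(n_features):
            weights[f] += (diff(f, row, near_miss)
                           - diff(f, row, near_hit)) / len(X)
    return weights


# Toy data: feature 0 separates the two classes, feature 1 does not.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1],
     [0.9, 0.4], [1.0, 0.8], [0.8, 0.2]]
y = [0, 0, 0, 1, 1, 1]
weights = relief_weights(X, y)
```

Features with higher weights are the ones that differ across classes but agree within a class, which is the sense in which the algorithm flags inputs as relevant.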

neural network, prediction, wind speed, (14 more...)

doi: 10.1016/j.esr.2022.100864

2401.14065

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.06)
Europe > Spain > Galicia > Madrid (0.04)
Europe > Spain > Castilla-La Mancha > Ciudad Real Province > Ciudad Real (0.04)
(20 more...)

Genre: Research Report > New Finding (0.68)

Industry: Energy > Renewable > Wind (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)

Ghosh, Tanmay, Nagaraj, Nithin

Evaluating the Determinants of Mode Choice Using Statistical and Machine Learning Techniques in the Indian Megacity of Bengaluru

The decision making behind mode choice is critical for transportation planning. While statistical learning techniques like discrete choice models have been used traditionally, machine learning (ML) models have recently gained traction among transportation planners due to their higher predictive performance. However, the black-box nature of ML models poses significant interpretability challenges, limiting their practical application in decision and policy making. This study utilised a dataset of 1350 households in the low and low-middle income brackets in the city of Bengaluru to investigate mode choice decision-making behaviour using a multinomial logit model and ML classifiers such as decision trees, random forests, extreme gradient boosting, and support vector machines. In terms of accuracy, the random forest model performed best (0.788 on training data and 0.605 on testing data) compared to all the other models. This research adopted modern interpretability techniques like feature importance and individual conditional expectation plots to explain the decision-making behaviour using ML models. Higher travel costs significantly reduce the predicted probability of bus usage compared to other modes (a 0.66% and 0.34% reduction using the Random Forests and XGBoost models, respectively, for a 10% increase in travel cost). However, reducing travel time by 10% increases the preference for the metro (0.16% in Random Forests and 0.42% in XGBoost). This research augments the ongoing research on mode choice analysis using machine learning techniques, which would help in improving the understanding of the performance of these models with real-world data in terms of both accuracy and interpretability.
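The kind of sensitivity question asked above (how does the predicted probability of choosing the bus move when its cost rises by 10%?) can be illustrated with a multinomial logit model, the statistical baseline the abstract names. Everything numeric below is hypothetical: the coefficients `beta_cost` and `beta_time`, the three modes, and their costs and times are made up for the sketch and are not estimates from the study, whose reported figures come from Random Forest and XGBoost models.

```python
import math


def mode_probabilities(utilities):
    """Softmax over per-mode utilities, as in a multinomial logit model."""
    peak = max(utilities.values())  # subtract the max for numerical stability
    exps = {mode: math.exp(u - peak) for mode, u in utilities.items()}
    total = sum(exps.values())
    return {mode: e / total for mode, e in exps.items()}


def utilities_for(costs, times, beta_cost=-0.05, beta_time=-0.02):
    """Linear utility per mode; both coefficients are hypothetical."""
    return {mode: beta_cost * costs[mode] + beta_time * times[mode]
            for mode in costs}


# Hypothetical trip attributes for three modes.
costs = {"bus": 20.0, "metro": 30.0, "two_wheeler": 40.0}
times = {"bus": 50.0, "metro": 35.0, "two_wheeler": 30.0}

baseline = mode_probabilities(utilities_for(costs, times))

# Raise only the bus cost by 10% and recompute the probabilities.
raised_costs = dict(costs, bus=costs["bus"] * 1.10)
after = mode_probabilities(utilities_for(raised_costs, times))

bus_change = after["bus"] - baseline["bus"]  # negative when beta_cost < 0
```

With a negative cost coefficient the bus share can only fall, and the other modes absorb the difference. Individual conditional expectation plots ask the same what-if question of a black-box model, one household at a time.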

dataset, probability, travel time, (16 more...)

2401.13977

Country:

Asia > India > Karnataka > Bengaluru (0.70)
North America > Canada > British Columbia > Vancouver Island > Capital Regional District > Victoria (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Transportation > Infrastructure & Services (1.00)
Health & Medicine (0.93)
Government (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.86)

del Río, Tereso, England, Matthew

Lessons on Datasets and Paradigms in Machine Learning for Symbolic Computation: A Case Study on CAD

arXiv.org Artificial Intelligence Jan-24-2024

Symbolic Computation algorithms and their implementation in computer algebra systems often contain choices which do not affect the correctness of the output but can significantly impact the resources required: such choices can benefit from having them made separately for each problem via a machine learning model. This study reports lessons on such use of machine learning in symbolic computation, in particular on the importance of analysing datasets prior to machine learning and on the different machine learning paradigms that may be utilised. We present results for a particular case study, the selection of variable ordering for cylindrical algebraic decomposition, but expect that the lessons learned are applicable to other decisions in symbolic computation. We utilise an existing dataset of examples derived from applications which was found to be imbalanced with respect to the variable ordering decision. We introduce an augmentation technique for polynomial systems problems that allows us to balance and further augment the dataset, improving the machine learning results by 28% and 38% on average, respectively. We then demonstrate how the existing machine learning methodology used for the problem (classification) might be recast into the regression paradigm. While this does not radically change the performance, it does widen the scope in which the methodology can be applied to make choices.
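One way to picture the difference between the two paradigms is sketched below. In the classification view a model scores a fixed set of ordering labels and the top label wins; in the regression view a model predicts a cost for each candidate ordering and the cheapest wins, which also works when only a subset of orderings is on offer. The function names, the ordering strings, and all scores and costs are hypothetical, and this is one reading of the abstract rather than the authors' implementation.

```python
def choose_by_classification(class_scores):
    """Classification paradigm: pick the ordering label with the highest score."""
    return max(class_scores, key=class_scores.get)


def choose_by_regression(predicted_costs, candidates=None):
    """Regression paradigm: predict a cost per ordering, pick the cheapest.

    Passing `candidates` restricts the choice to a subset of orderings
    without retraining anything.
    """
    pool = list(predicted_costs) if candidates is None else candidates
    return min(pool, key=lambda ordering: predicted_costs[ordering])


# Hypothetical model outputs for one three-variable problem.
class_scores = {"x>y>z": 0.2, "y>x>z": 0.7, "z>y>x": 0.1}
predicted_costs = {"x>y>z": 12.5, "y>x>z": 3.1, "z>y>x": 40.0}

by_class = choose_by_classification(class_scores)
by_cost = choose_by_regression(predicted_costs)
restricted = choose_by_regression(predicted_costs, ["x>y>z", "z>y>x"])
```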

dataset, paradigm, polynomial, (11 more...)

2401.13343

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > United Kingdom > England (0.06)
Europe > Switzerland (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Systems & Languages > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.46)

Zhou, Emily, Soleymani, Mohammad, Matarić, Maja J.

Investigating the Generalizability of Physiological Characteristics of Anxiety

arXiv.org Artificial Intelligence Jan-23-2024

Recent works have demonstrated the effectiveness of machine learning (ML) techniques in detecting anxiety and stress using physiological signals, but it is unclear whether ML models are learning physiological features specific to stress. To address this ambiguity, we evaluated the generalizability of physiological features that have been shown to be correlated with anxiety and stress to high-arousal emotions. Specifically, we examine features extracted from electrocardiogram (ECG) and electrodermal (EDA) signals from the following three datasets: Anxiety Phases Dataset (APD), Wearable Stress and Affect Detection (WESAD), and the Continuously Annotated Signals of Emotion (CASE) dataset. We aim to understand whether these features are specific to anxiety or general to other high-arousal emotions through a statistical regression analysis, in addition to a within-corpus, cross-corpus, and leave-one-corpus-out cross-validation across instances of stress and arousal. We used the following classifiers: Support Vector Machines, LightGBM, Random Forest, XGBoost, and an ensemble of the aforementioned models. We found that models trained on an arousal dataset perform relatively well on a previously unseen stress dataset, and vice versa. Our experimental results suggest that the evaluated models may be identifying emotional arousal instead of stress. This work is the first cross-corpus evaluation across stress and arousal from ECG and EDA signals, contributing new findings about the generalizability of stress detection.
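The leave-one-corpus-out scheme mentioned above can be sketched as a small split generator: each corpus is held out in turn as the test set while the remaining corpora are pooled for training. The corpus names come from the abstract; the function name `leave_one_corpus_out` and the placeholder rows are assumptions for illustration.

```python
def leave_one_corpus_out(corpora):
    """Yield (held_out_name, train_rows, test_rows) for each corpus in turn."""
    for held_out in corpora:
        train = [row
                 for name, rows in corpora.items() if name != held_out
                 for row in rows]
        yield held_out, train, list(corpora[held_out])


# Placeholder rows standing in for per-window ECG/EDA feature vectors.
corpora = {
    "APD": ["apd_0", "apd_1"],
    "WESAD": ["wesad_0", "wesad_1", "wesad_2"],
    "CASE": ["case_0"],
}
splits = list(leave_one_corpus_out(corpora))
```

Because no row from the held-out corpus is seen during training, a model that still scores well on it is evidence that the learned features transfer across datasets.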

arousal, dataset, detection, (16 more...)

doi: 10.1109/BIBM58861.2023.10385292

2402.15513

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > Netherlands (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (0.89)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

arXiv.org Artificial Intelligence Jan-23-2024

Performance Analysis of Support Vector Machine (SVM) on Challenging Datasets for Forest Fire Detection

Kar, Ankan, Nath, Nirjhar, Kemprai, Utpalraj, Aman

This article delves into the analysis of performance and utilization of Support Vector Machines (SVMs) for the critical task of forest fire detection using image datasets. With the increasing threat of forest fires to ecosystems and human settlements, the need for rapid and accurate detection systems is of utmost importance. SVMs, renowned for their strong classification capabilities, exhibit proficiency in recognizing patterns associated with fire within images. By training on labeled data, SVMs acquire the ability to identify distinctive attributes associated with fire, such as flames, smoke, or alterations in the visual characteristics of the forest area. The document thoroughly examines the use of SVMs, covering crucial elements like data preprocessing, feature extraction, and model training. It rigorously evaluates parameters such as accuracy, efficiency, and practical applicability. The knowledge gained from this study aids in the development of efficient forest fire detection systems, enabling prompt responses and improving disaster management. Moreover, the correlation between SVM accuracy and the difficulties presented by high-dimensional datasets is carefully investigated, demonstrated through a revealing case study. The relationship between accuracy scores and the different resolutions used for resizing the training datasets has also been discussed in this article. These comprehensive studies result in a definitive overview of the difficulties faced and the potential sectors requiring further improvement and focus.
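To make the link between resize resolution and dimensionality concrete, the sketch below counts how many input features an SVM would see per image. It assumes, purely for illustration, that raw pixels are flattened into the feature vector and that images are square with three colour channels; the abstract does not say which features or resolutions were used, and `flattened_length` is a hypothetical helper.

```python
def flattened_length(width, height, channels=3):
    """Number of features when an image is flattened to raw pixel values."""
    return width * height * channels


# Hypothetical square resize targets; doubling the side quadruples the features.
feature_counts = {side: flattened_length(side, side) for side in (32, 64, 128)}
```

Under this assumption the feature count grows with the square of the side length, so a modest increase in resolution quickly moves the classifier into the high-dimensional regime the study examines.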

dataset, support vector machine, svm, (12 more...)

2401.12924

Country: Asia > India > Tamil Nadu > Chennai (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)