AITopics

2208.00565

Country: North America > United States > Maryland > Baltimore (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (0.94)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.70)

Zanartu, Francisco, Treude, Christoph, Cartaxo, Bruno, Borges, Hudson Silva, Moura, Pedro, Wagner, Markus, Pinto, Gustavo

Automatically Categorising GitHub Repositories by Application Domain

arXiv.org Artificial Intelligence, Jul-30-2022

For example, there are limited means available to separate repositories containing engineered software projects from other repositories, such as personal projects or those that use GitHub for free cloud storage (Kalliamvakou et al., 2014; Munaiah et al., 2017). To make it easier for users to identify relevant repositories for their wide variety of use cases, GitHub has been adding features to its service, such as README files, topic tags, and showcases (where contributors describe, add keywords, and label their repository). However, these features are insufficient for many use cases. For example, while achieving generalizability of the results is the primary objective of many empirical papers, modern computing research is largely application domain independent (Capiluppi et al., 2020). Application domains are the sections of reality for which a software system is designed. Their importance lies in their serving as the starting point for actual state analysis, and they usually include domain-specific language, meaning that developers in this domain think about their project in a specific way, with particular terms and concepts (Züllighoven, 2004). Application domains are not a feature currently implemented by GitHub to catalogue repositories. Previous work has found that repository quality indicators, such as object-oriented metrics, can be "extremely sensitive to application domains" (Capiluppi and Ajienka, 2019), and that the application domain is an important factor in predicting repository popularity (Borges et al., 2016). Furthermore, since documentation of GitHub repositories is often incomplete (Prana et al., 2019), information about the application domain of a repository can be crucial to gain a high-level understanding of its content and purpose.

machine learning, natural language, programming language, (17 more...)

2208.00269

Country:

South America > Brazil > Pernambuco (0.04)
Oceania > Australia > South Australia > Adelaide (0.04)
South America > Brazil > Pará (0.04)
(4 more...)

Genre: Research Report > New Finding (0.94)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

arXiv.org Artificial Intelligence, Jul-30-2022

PrePARE: Predictive Proprioception for Agile Failure Event Detection in Robotic Exploration of Extreme Terrains

Dey, Sharmita, Fan, David, Schmid, Robin, Dixit, Anushri, Otsu, Kyohei, Touma, Thomas, Schilling, Arndt F., Agha-mohammadi, Ali-akbar

Legged robots can traverse a wide variety of terrains, some of which may be challenging for wheeled robots, such as stairs or highly uneven surfaces. However, quadruped robots face stability challenges on slippery surfaces. This can be resolved by adjusting the robot's locomotion by switching to more conservative and stable locomotion modes, such as crawl mode (where three feet are always in contact with the ground) or amble mode (where one foot touches down at a time) to prevent potential falls. To tackle these challenges, we propose an approach to learn a model from past robot experience for predictive detection of potential failures. Accordingly, we trigger gait switching merely based on proprioceptive sensory information. To learn this predictive model, we propose a semi-supervised process for detecting and annotating ground truth slip events in two stages: we first detect abnormal occurrences in the time series sequences of the gait data using an unsupervised anomaly detector, and then the anomalies are verified with expert human knowledge in a replay simulation to assert the event of a slip. These annotated slip events are then used as ground truth examples to train an ensemble decision learner for predicting slip probabilities across terrains for traversability. We analyze our model on data recorded by a legged robot on multiple sites with slippery terrain. We demonstrate that a potential slip event can be predicted up to 720 ms ahead of a potential fall with an average precision greater than 0.95 and an average F-score of 0.82. Finally, we validate our approach in real time by deploying it on a legged robot and switching its gait mode based on slip event detection.

information, robot, slip event, (17 more...)

2208.00322

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > California > Los Angeles County > Pasadena (0.05)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Robots > Locomotion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Chowdhury, Mohammed Nowshad Ruhani, Zhang, Wandong, Akilan, Thangarajah

ANOVA-based Automatic Attribute Selection and a Predictive Model for Heart Disease Prognosis

arXiv.org Artificial Intelligence, Jul-30-2022

Studies show that cardiovascular diseases (CVDs) are malignant for human health. Thus, it is important to have an efficient way of CVD prognosis. In response to this, the healthcare industry has adopted machine learning-based smart solutions to alleviate the manual process of CVD prognosis. Thus, this work proposes an information fusion technique that combines key attributes of a person through analysis of variance (ANOVA) and domain experts' knowledge. It also introduces a new collection of CVD data samples for emerging research. There are thirty-eight experiments conducted exhaustively to verify the performance of the proposed framework on four publicly available benchmark datasets and the newly created dataset in this work. The ablation study shows that the proposed approach can achieve a competitive mean average accuracy (mAA) of 99.2% and a mean average AUC of 97.9%.
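As a rough illustration of ANOVA-based attribute selection in general (not the authors' pipeline, which also fuses domain experts' knowledge), the sketch below scores each attribute with a one-way ANOVA F-statistic across class labels and ranks attributes by that score. The function names `anova_f` and `rank_features`, the attribute names, and the toy values are all hypothetical and chosen only for this example.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single attribute across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    # Variation of the class means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    # Variation of the samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_features(rows, labels, names):
    """Return attribute names sorted from highest to lowest F-statistic."""
    scores = {
        name: anova_f([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy data: (age, resting_bp, noise) per person, label 1 = CVD.
rows = [
    (50, 120, 3), (52, 118, 7), (48, 122, 5),
    (51, 160, 4), (49, 158, 6), (53, 162, 5),
]
labels = [0, 0, 0, 1, 1, 1]
names = ["age", "resting_bp", "noise"]

print(rank_features(rows, labels, names))
```

In this toy data only `resting_bp` separates the two classes, so it receives by far the largest F-statistic. A selection step would then keep the top-ranked attributes, possibly adjusted by expert input.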

classifier, dataset, feature selection, (16 more...)

2208.00296

Country:

Europe > Switzerland (0.06)
Asia > Bangladesh > Sylhet Division > Sylhet District > Sylhet (0.04)
North America > Canada > Ontario > Thunder Bay (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
(4 more...)

#artificialintelligence, Jul-29-2022, 22:50:12 GMT

June 2022: "Top 40" New CRAN Packages

One hundred eighty-nine new packages made it to CRAN in June. Here are my “Top 40” selections in eleven categories: Computational Methods, Data, Ecology, Genomics, Machine Learning, Mathematics, Medicine, Statistics, Time Series, Utilities, and Visualization.

Computational Methods:

itp v1.2.0: Implements the interpolate, truncate, project root-finding algorithm developed by Oliveira & Takahashi (2021). The vignette provides an overview.
QR v0.1.3: Provides a function to perform QR factorization without pivoting to a real or complex matrix. It is based on LAPACK. See the vignette.
qsplines v1.0.0: Provides functions to create quaternion splines. See Barry & Goldman (1988) and Kochanek & Bartels (1984) for the details and look here for an example.
VMDecomp v1.0.1: Implements the variational mode decomposition and two-dimensional variational mode decomposition algorithm. See Dragomiretskiy & Zosso (2014) for background and the vignette for examples.

Data:

cmch v0.2.0: Implements a wrapper around the Canadian Mortgage and Housing Corporation web interface and enables programmatic and reproducible access to a wide variety of housing data. See the vignette for examples.
EDIutils v1.0.1: Implements a client for the Environmental Data Initiative repository REST API and provides access to ecological data and metadata. There are five short vignettes: Evaluate & upload, Citation Metrics, Download Metrics, Search and access, and Tests.
globaltrends v0.0.12: Provides functions to access global search volumes from the Google Trends portal. This working paper outlines the package’s methodological foundations and potential applications. See the vignette to get started.
kaigiroku v0.5: Allows users to search and download data from the API for Japanese Diet proceedings. Look here for examples.
NasdaqDataLink v1.0.0: Provides functions to interact directly with the Nasdaq Data Link API and obtain data in a number of formats. Look here for API documentation and here for package information.
stortingscrape v0.1.1: Provides functions for retrieving data from the Norwegian Parliament, through the Norwegian Parliament API. See the vignette for an introduction.

Ecology:

PointedSDMs v1.0.6: Provides tools to build integrated species distribution models and includes tools to run spatial cross-validation and plotting. See Issac et al. (2020) for an introduction to the methods. There is a Setophaga Example and an example for the Solitary Tinamou.
restoptr v1.0.1: Implements a flexible framework for ecological restoration planning that aims to identify priority areas for restoration efforts using optimization algorithms described in Justeau-Allaire et al. (2021). See the vignette to get started.

Genomics:

scapGNN v0.1.1: Implements a single cell active pathway analysis tool based on the graph neural network algorithm described in Scarselli et al. (2009) and Kipf & Welling (2017). This may be used to construct a gene-cell association network, infer pathway activity scores from different single cell modalities data and more. See the vignette for an overview and examples.
SRTsim v0.99.2: Implements an independent, reproducible, and flexible Spatially Resolved Transcriptomics simulation framework that can be used to facilitate the development of analytical methods and for a wide variety of SRT-specific analyses. See the vignette.
xQTLbiolinks v1.1.1: Implements tools to query, download, and visualize molecular quantitative trait locus and gene expression data from public resources through the GTEx API. There is a Quick Start Guide and vignettes on Colocalization, Specificity, and Visualization.

Machine Learning:

agua v0.0.1: Enables users to specify h2o as an engine for several tidymodels modeling methods. See README for examples.
MagmaClustR v1.0.0: Implements two main algorithms, called Magma (Leroy et al. (2022)) and MagmaClust (Leroy et al. (2020)), using a multi-task Gaussian processes (GP) model to perform predictions for supervised learning problems. See README for examples.
openai v0.1.0: Provides a wrapper for OpenAI API endpoints including engines, completions, edits, files, fine-tunes, embeddings and legacy searches, classifications, and answers endpoints. See README to get started.
sketching v0.1.0: Provides functions to construct sketches of data via random subspace embeddings. See Lee & Ng (2022) for the theory and the vignette for examples.
webmorphR v0.1.1: Provides functions to create reproducible image stimuli, specialised for face images with psychomorph or webmorph templates. See README to get started.

Mathematics:

GeneralizedWendland v0.5-2: Implements the fully parameterized generalized Wendland covariance function for use in Gaussian process models, as well as multiple methods for approximating it via covariance interpolation. The available methods are linear interpolation, polynomial interpolation, and cubic spline interpolation. See Bevilacqua et al. (2022) and the vignette for examples.
jacobi v2.0.0: Evaluates Jacobi theta functions and related functions including the Weierstrass elliptic function, the Weierstrass sigma function, the Weierstrass zeta function, the Klein j-function, the Dedekind eta function, the lambda modular function, Jacobi elliptic functions, Neville theta functions, and the Eisenstein series for real and complex variables. Look here for some images.

Medicine:

clinicalsignificance v1.0.0: Implements the clinical significance algorithm proposed by Jacobson et al. (1984) to determine if an intervention has a meaningful practical effect. There is a Getting Started Guide and vignettes on Cutoffs and Plots.
PlatformDesign v1.0.1: Provides functions to calculate design parameters for an optimal two-period, multi-arm platform design allowing pre-planned deferred arms to be added during the trial. See Dunnett (1955) for background and the vignette for some theory and examples.

Statistics:

bayesassurance v0.1.0: Provides functions to compute Bayesian assurance under various settings characterized by different assumptions and objectives, including precision-based conditions, credible intervals, and goal functions. See Pan & Banerjee (2021) for the theory. There are vignettes for using closed form solutions, the conjugate linear model, and precision based conditions.
DSSP v0.1.1: Provides functions to draw samples from the direct sampling spatial prior model as described in White, Sun, & Speckman (2019). See the vignette for examples.
edibble v0.1.0: Implements a system to facilitate designing comparative experiments using the grammar of experimental designs. See the edibble-book for documentation.
mixgb v0.1.0: Implements a method for multiple imputation using XGBoost, bootstrapping and predictive mean matching as described in Deng and Lumley (2021). There is an Introduction and a vignette on Imputing new data with a saved imputer.
outerbase v0.1.0: Implements a new method for high-dimensional regression using outer product models. See Plumlee (2014) and Plumlee et al. (2021) for background. There is a Getting started guide, a Base walkthrough, and vignettes on Learning from data and Speeding up inference.
PFIM v5.0: Provides functions to evaluate or optimize designs for nonlinear mixed effects models using the Fisher Information matrix. See Malle & Baccar D (1997) and Retout et al. (2007) for background and the vignettes Design evaluation and optimization (01), Design evaluation and optimization (02), and Library of models for examples.
VirtualPop v1.0.2: Provides functions to generate lifespans and fertility histories in continuous time using individual-level state transition (multi-state) models and data. See the vignettes on Simulation of life histories, Sampling from waiting time distributions, Simulation of individual fertility careers, and Validation.

Time Series:

kssa v0.0.1: Implements the known sub-sequence algorithm described in Benavides et al. (2022), which helps to automatically identify and validate the best method for missing data imputation in a time series. Look here for examples.
ts2net v0.1.0: Implements methods to transform time series into networks, a technique which may be useful for complex systems modeling, time series data mining, or time series analysis using networks. For an introduction to the topic and descriptions of the methods see Mitchell (2006), Silva & Zhao (2016), and Silva et al. (2021). See README to get started.

Utilities:

cppcheckR: Allows users to run Cppcheck on C/C++ files as an R command or an RStudio addin. See README.
gtExtras v0.4.1: Provides additional functions for creating tables with gt. See README for examples.

Visualization:

ggpie v0.2.2: Provides functions for creating pie, donut and rose pie plots with ggplot2. See the vignette.
ggtrace v0.2.0: Provides ggplot2 geoms that allow groups of data points to be outlined or highlighted for emphasis. See the vignettes Trace lines and Trace points.
Morphoscape v1.0.0: Implements adaptive landscape methods first described by Polly et al. (2016) for the integration, analysis and visualization of biological trait data on a phenotypic morphospace, which is typically defined by shape metrics. See the vignette.
r3js v0.0.1: Provides R and JavaScript functions to allow WebGL-based 3D plotting using the three.js library. See the vignettes: Getting Started, Creating a plot from scratch, and Grouping plot elements.
rgl2gltf v1.0.0: Provides functions to work with glTF files which are used to describe 3D models. See the vignette for examples.
shapviz v0.2.0: Provides functions to visualize SHapley Additive exPlanations (SHAP), such as waterfall plots, force plots, various types of importance plots, and dependence plots. See Lundberg & Lee (2017) for background and the vignette for examples.

algorithm, provide function, vignette, (13 more...)

#artificialintelligence

Genre: Overview (0.75)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.75)
Education (0.55)
Banking & Finance (0.55)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.55)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.45)
(2 more...)

Mandal, Nibir Chandra, Shahariar, G. M., Shawon, Md. Tanvir Rouf

Effectiveness of Transformer Models on IoT Security Detection in StackOverflow Discussions

The Internet of Things (IoT) is an emerging concept that directly links to the billions of physical items, or "things", that are connected to the Internet and are all gathering and exchanging information between devices and systems. However, IoT devices were not built with security in mind, which might lead to security vulnerabilities in a multi-device system. Traditionally, we investigated IoT issues by polling IoT developers and specialists. This technique, however, is not scalable since surveying all IoT developers is not feasible. Another way to look into IoT issues is to look at IoT developer discussions on major online development forums like Stack Overflow (SO). However, finding discussions that are relevant to IoT issues is challenging since they are frequently not categorized with IoT-related terms. In this paper, we present the "IoT Security Dataset", a domain-specific dataset of 7147 samples focused solely on IoT security discussions. As there are no automated tools to label these samples, we manually labeled them. We further employed multiple transformer models to automatically detect security discussions. Through rigorous investigations, we found that IoT security discussions are different and more complex than traditional security discussions. We demonstrated a considerable performance loss (up to 44%) of transformer models on cross-domain datasets when we transferred knowledge from a general-purpose dataset "Opiner", supporting our claim. Thus, we built a domain-specific IoT security detector with an F1-Score of 0.69. We have made the dataset public in the hope that developers would learn more about the security discussion and vendors would enhance their concerns about product security.

artificial intelligence, machine learning, natural language, (17 more...)

doi: 10.1007/978-981-19-7528-8_10

2207.14542

Country:

North America > United States > New York > New York County > New York City (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Bangladesh (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Internet of Things (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)

Meta-TTS: Meta-Learning for Few-Shot Speaker Adaptive Text-to-Speech

Huang, Sung-Feng, Lin, Chyi-Jiunn, Liu, Da-Rong, Chen, Yi-Chen, Lee, Hung-yi

Personalizing a speech synthesis system is a highly desired application, where the system can generate speech with the user's voice with rare enrolled recordings. There are two main approaches to build such a system in recent works: speaker adaptation and speaker encoding. On the one hand, speaker adaptation methods fine-tune a trained multi-speaker text-to-speech (TTS) model with few enrolled samples. However, they require at least thousands of fine-tuning steps for high-quality adaptation, making it hard to apply on devices. On the other hand, speaker encoding methods encode enrollment utterances into a speaker embedding. The trained TTS model can synthesize the user's speech conditioned on the corresponding speaker embedding. Nevertheless, the speaker encoder suffers from the generalization gap between the seen and unseen speakers. In this paper, we propose applying a meta-learning algorithm to the speaker adaptation method. More specifically, we use Model Agnostic Meta-Learning (MAML) as the training algorithm of a multi-speaker TTS model, which aims to find a great meta-initialization to adapt the model to any few-shot speaker adaptation tasks quickly. Therefore, we can also adapt the meta-trained TTS model to unseen speakers efficiently. Our experiments compare the proposed method (Meta-TTS) with two baselines: a speaker adaptation method baseline and a speaker encoding method baseline. The evaluation results show that Meta-TTS can synthesize high speaker-similarity speech from few enrollment samples with fewer adaptation steps than the speaker adaptation baseline and outperforms the speaker encoding baseline under the same training scheme. When the speaker encoder of the baseline is pre-trained with extra 8371 speakers of data, Meta-TTS can still outperform the baseline on LibriTTS dataset and achieve comparable results on VCTK dataset.
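To make the meta-initialization idea concrete, here is a minimal MAML sketch on a toy problem, not the Meta-TTS model. It assumes each "task" is a scalar regression y = slope * x whose loss reduces to c * (w - slope)^2, uses a single inner gradient step, and reuses the same data for the inner and outer loss for brevity. The names `adapt` and `maml_train` and all hyperparameters are hypothetical.

```python
def loss_grad(w, slope, c):
    """Gradient of the task loss c * (w - slope)**2 with respect to w."""
    return 2.0 * c * (w - slope)


def adapt(w, slope, c=1.0, inner_lr=0.1, steps=1):
    """Task-specific adaptation: a few plain gradient steps from w."""
    for _ in range(steps):
        w = w - inner_lr * loss_grad(w, slope, c)
    return w


def maml_train(task_slopes, c=1.0, inner_lr=0.1, outer_lr=0.05, steps=500):
    """Learn an initialization w that performs well after one inner step."""
    w = 0.0
    for _ in range(steps):
        meta_grad = 0.0
        for slope in task_slopes:
            adapted = adapt(w, slope, c, inner_lr, steps=1)
            # Chain rule through the inner step:
            # d(adapted)/dw = 1 - 2 * inner_lr * c for this quadratic loss.
            meta_grad += loss_grad(adapted, slope, c) * (1.0 - 2.0 * inner_lr * c)
        w -= outer_lr * meta_grad / len(task_slopes)
    return w


# Hypothetical training tasks (stand-ins for seen speakers).
w_init = maml_train([1.0, 2.0, 3.0])
# Few-step adaptation to an unseen task (stand-in for a new speaker).
w_new = adapt(w_init, slope=2.5, steps=3)
print(w_init, w_new)
```

In this toy setting the learned initialization settles near the average task parameter, so a handful of inner steps moves it close to any individual task. Real speaker adaptation involves a full TTS network and separate support and query data, which this sketch omits.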

artificial intelligence, machine learning, utterance, (17 more...)

doi: 10.1109/TASLP.2022.3167258

2111.0404

Country: Asia > Taiwan > Taiwan Province > Taipei (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)
Information Technology > Artificial Intelligence > Speech > Speech Synthesis (0.91)
Information Technology > Artificial Intelligence > Vision > Optical Character Recognition (0.61)

Aoudia, Fayçal Aït, Hoydis, Jakob, Cammerer, Sebastian, Van Keirsbilck, Matthijs, Keller, Alexander

Deep Learning-Based Synchronization for Uplink NB-IoT

We propose a neural network (NN)-based algorithm for device detection and time of arrival (ToA) and carrier frequency offset (CFO) estimation for the narrowband physical random-access channel (NPRACH) of narrowband internet of things (NB-IoT). The introduced NN architecture leverages residual convolutional networks as well as knowledge of the preamble structure of the 5G New Radio (5G NR) specifications. Benchmarking on a 3rd Generation Partnership Project (3GPP) urban microcell (UMi) channel model with random drops of users against a state-of-the-art baseline shows that the proposed method enables up to 8 dB gains in false negative rate (FNR) as well as significant gains in false positive rate (FPR) and ToA and CFO estimation accuracy. Moreover, our simulations indicate that the proposed algorithm enables gains over a wide range of channel conditions, CFOs, and transmission probabilities. The introduced synchronization method operates at the base station (BS) and, therefore, introduces no additional complexity on the user devices. It could lead to an extension of battery lifetime by reducing the preamble length or the transmit power. Our code is available at: https://github.com/NVlabs/nprach_synch/.

cfo, preamble, probability, (16 more...)

2205.10805

Genre: Research Report (0.40)

Industry:

Telecommunications (0.35)
Information Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Xu, Yanhua, Wojtczak, Dominik

Multi-channel neural networks for predicting influenza A virus hosts and antigenic types

Influenza occurs every season and occasionally causes pandemics. Despite its low mortality rate, influenza is a major public health concern, as it can be complicated by severe diseases like pneumonia. A fast, accurate and low-cost method to predict the origin host and subtype of influenza viruses could help reduce virus transmission and benefit resource-poor areas. In this work, we propose multi-channel neural networks to predict antigenic types and hosts of influenza A viruses with hemagglutinin and neuraminidase protein sequences. An integrated data set containing complete protein sequences was used to produce a pre-trained model, and two other data sets were used for testing the model's performance. One test set contained complete protein sequences, and another test set contained incomplete protein sequences. The results suggest that multi-channel neural networks are applicable and promising for predicting influenza A virus hosts and antigenic subtypes with complete and partial protein sequences.

neural network, sequence, virus, (14 more...)

2206.03823

Country:

Asia > Middle East > Republic of Türkiye (0.05)
Europe > Russia (0.04)
Asia > Russia (0.04)
(2 more...)

Genre: Research Report > New Finding (0.88)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Y, Meghanath Macha, Ravindran, Sriram, Pai, Deepak, Narang, Anish, Srivastava, Vijay

Multiple Attribute Fairness: Application to Fraud Detection

arXiv.org Artificial Intelligence, Jul-28-2022

We propose a fairness measure relaxing the equality conditions in the popular equal odds fairness regime for classification. We design an iterative, model-agnostic, grid-based heuristic that calibrates the outcomes per sensitive attribute value to conform to the measure. The heuristic is designed to handle high arity attribute values and performs a per attribute sanitization of outcomes across different protected attribute values. We also extend our heuristic for multiple attributes. Highlighting our motivating application, fraud detection, we show that the proposed heuristic is able to achieve fairness across multiple values of a single protected attribute as well as across multiple protected attributes. When compared to current fairness techniques that focus on two groups, we achieve comparable performance across several public data sets.
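The abstract does not spell out the relaxed measure, so the following is only one plausible reading, offered as an assumption: instead of requiring true positive and false positive rates to be exactly equal across groups (strict equal odds), allow them to differ by at most a tolerance. The function names and the `epsilon` parameter are hypothetical, and the sketch handles a protected attribute with more than two values.

```python
from collections import defaultdict


def group_rates(y_true, y_pred, groups):
    """Return {group: (true positive rate, false positive rate)}."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for t, p, g in zip(y_true, y_pred, groups):
        if t:
            counts[g]["tp" if p else "fn"] += 1
        else:
            counts[g]["fp" if p else "tn"] += 1
    rates = {}
    for g, c in counts.items():
        pos, neg = c["tp"] + c["fn"], c["fp"] + c["tn"]
        rates[g] = (c["tp"] / pos if pos else 0.0, c["fp"] / neg if neg else 0.0)
    return rates


def equalized_odds_gap(rates):
    """Largest spread in TPR or FPR across all groups."""
    tprs = [r[0] for r in rates.values()]
    fprs = [r[1] for r in rates.values()]
    return max(max(tprs) - min(tprs), max(fprs) - min(fprs))


def satisfies_relaxed_odds(y_true, y_pred, groups, epsilon):
    """Assumed relaxation: group rates may differ by at most epsilon."""
    return equalized_odds_gap(group_rates(y_true, y_pred, groups)) <= epsilon


# Hypothetical fraud labels and predictions for three groups A, B, C.
y_true = [1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0]
groups = ["A"] * 4 + ["B"] * 4 + ["C"] * 4

print(equalized_odds_gap(group_rates(y_true, y_pred, groups)))
```

Setting `epsilon` to zero recovers the strict equal odds condition, while larger values tolerate bigger disparities between groups.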

artificial intelligence, fairness measure, machine learning, (13 more...)

2207.14355

Country:

North America > United States > District of Columbia > Washington (0.05)
North America > United States > New York > New York County > New York City (0.04)

Genre: Research Report (0.82)

Industry:

Law (1.00)
Law Enforcement & Public Safety > Fraud (0.89)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Data Science (0.96)