AITopics

Rego, Rosana C. B., Silva, Verônica M. L.

Predicting gender of Brazilian names using deep learning

Predicting gender by the name is not a simple task. In many applications, especially in the natural language processing (NLP) field, this task may be necessary, mainly when considering foreign names. Some machine learning algorithms can satisfactorily perform the prediction. In this paper, we examined and implemented feedforward and recurrent deep neural network models, such as MLP, RNN, GRU, CNN, and BiLSTM, to classify gender through the first name. A dataset of Brazilian names is used to train and evaluate the models. We analyzed the accuracy, recall, precision, and confusion matrix to measure the models' performances. The results indicate that the gender prediction can be performed from the feature extraction strategy looking at the names as a set of strings. Some models accurately predict the gender in more than 90% of the cases. The recurrent models overcome the feedforward models in this binary classification problem.

artificial intelligence, gender, machine learning, (16 more...)

2106.10156

Country:

Asia > Taiwan (0.04)
South America > Brazil > Rio Grande do Norte (0.04)
Asia > India (0.04)

Genre: Research Report (0.67)

Industry: Information Technology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Moreo, Alejandro, Esuli, Andrea, Sebastiani, Fabrizio

QuaPy: A Python-Based Framework for Quantification

QuaPy is an open-source framework for performing quantification (a.k.a. supervised prevalence estimation), written in Python. Quantification is the task of training quantifiers via supervised learning, where a quantifier is a predictor that estimates the relative frequencies (a.k.a. prevalence values) of the classes of interest in a sample of unlabelled data. While quantification can be trivially performed by applying a standard classifier to each unlabelled data item and counting how many data items have been assigned to each class, it has been shown that this "classify and count" method is outperformed by methods specifically designed for quantification. QuaPy provides implementations of a number of baseline methods and advanced quantification methods, of routines for quantification-oriented model selection, of several broadly accepted evaluation measures, and of robust evaluation protocols routinely used in the field. QuaPy also makes available datasets commonly used for testing quantifiers, and offers visualization tools for facilitating the analysis and interpretation of the results. The software is open-source and publicly available under a BSD-3 licence via https://github.com/HLT-ISTI/QuaPy, and can be installed via pip (https://pypi.org/project/QuaPy/)

dataset, prevalence value, quantifier, (14 more...)

2106.11057

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Italy > Tuscany > Pisa Province > Pisa (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)

Sebastianelli, Alessandro, Del Rosso, Maria Pia, Mathieu, Pierre Philippe, Ullo, Silvia Liberata

Paradigm selection for Data Fusion of SAR and Multispectral Sentinel data applied to Land-Cover Classification

Data fusion is a well-known technique, becoming more and more popular in the Artificial Intelligence for Earth Observation (AI4EO) domain mainly due to its ability of reinforcing AI4EO applications by combining multiple data sources and thus bringing better results. On the other hand, like other methods for satellite data analysis, data fusion itself is also benefiting and evolving thanks to the integration of Artificial Intelligence (AI). In this letter, four data fusion paradigms, based on Convolutional Neural Networks (CNNs), are analyzed and implemented. The goals are to provide a systematic procedure for choosing the best data fusion framework, resulting in the best classification results, once the basic structure for the CNN has been defined, and to help interested researchers in their work when data fusion applied to remote sensing is involved. The procedure has been validated for land-cover classification but it can be transferred to other cases.

classification, fusion, paradigm, (13 more...)

2106.11056

Country:

Europe > Italy (0.04)
Asia > Middle East > Republic of Türkiye (0.04)
Africa > Senegal (0.04)

Genre: Research Report (0.50)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.36)

Technology:

Information Technology > Data Science > Data Integration (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.97)

Nigus, Mersha, Dorsewamy, null

Performance Evaluation of Classification Models for Household Income, Consumption and Expenditure Data Set

Food security is more prominent on the policy agenda today than it has been in the past, thanks to recent food shortages at both the regional and global levels as well as renewed promises from major donor countries to combat chronic hunger. One field where machine learning can be used is in the classification of household food insecurity. In this study, we establish a robust methodology to categorize whether or not a household is being food secure and food insecure by machine learning algorithms. In this study, we have used ten machine learning algorithms to classify the food security status of the Household. Gradient Boosting (GB), Random Forest (RF), Extra Tree (ET), Bagging, K-Nearest Neighbor (KNN), Decision Tree (DT), Support Vector Machine (SVM), Logistic Regression (LR), Ada Boost (AB) and Naive Bayes were the classification algorithms used throughout this study (NB). Then, we perform classification tasks from developing data set for household food security status by gathering data from HICE survey data and validating it by Domain Experts. The performance of all classifiers has better results for all performance metrics. The performance of the Random Forest and Gradient Boosting models are outstanding with a testing accuracy of 0.9997 and the other classifier such as Bagging, Decision tree, Ada Boost, Extra tree, K-nearest neighbor, Logistic Regression, SVM and Naive Bayes are scored 0.9996, 0.09996, 0.9994, 0.95675, 0.9415, 0.8915, 0.7853 and 0.7595, respectively.

algorithm, classification, classifier, (15 more...)

2106.11055

Country:

Africa > Uganda (0.05)
Asia > India > Karnataka (0.04)
Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
(10 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (1.00)
Food & Agriculture (0.91)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.69)

arXiv.org Artificial IntelligenceJun-17-2021

On the Role of Entropy-based Loss for Learning Causal Structures with Continuous Optimization

Cai, Ruichu, Chen, Weilin, Qiao, Jie, Hao, Zhifeng

Causal discovery from observational data is an important but challenging task in many scientific fields. Recently, NOTEARS [Zheng et al., 2018] formulates the causal structure learning problem as a continuous optimization problem using least-square loss with an acyclicity constraint. Though the least-square loss function is well justified under the standard Gaussian noise assumption, it is limited if the assumption does not hold. In this work, we theoretically show that the violation of the Gaussian noise assumption will hinder the causal direction identification, making the causal orientation fully determined by the causal strength as well as the variances of noises in the linear case and the noises of strong non-Gaussianity in the nonlinear case. Consequently, we propose a more general entropy-based loss that is theoretically consistent with the likelihood score under any noise distribution. We run extensive empirical evaluations on both synthetic data and real-world data to validate the effectiveness of the proposed method and show that our method achieves the best in Structure Hamming Distance, False Discovery Rate, and True Positive Rate matrices.

additive noise model, causal direction, least-square loss, (10 more...)

2106.02835

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Quebec > Montreal (0.04)
Asia > China > Guangdong Province > Guangzhou (0.04)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

arXiv.org Machine LearningJun-17-2021

Importance measures derived from random forests: characterisation and extension

Sutera, Antonio

Nowadays new technologies, and especially artificial intelligence, are more and more established in our society. Big data analysis and machine learning, two sub-fields of artificial intelligence, are at the core of many recent breakthroughs in many application fields (e.g., medicine, communication, finance, ...), including some that are strongly related to our day-to-day life (e.g., social networks, computers, smartphones, ...). In machine learning, significant improvements are usually achieved at the price of an increasing computational complexity and thanks to bigger datasets. Currently, cutting-edge models built by the most advanced machine learning algorithms typically became simultaneously very efficient and profitable but also extremely complex. Their complexity is to such an extent that these models are commonly seen as black-boxes providing a prediction or a decision which can not be interpreted or justified. Nevertheless, whether these models are used autonomously or as a simple decision-making support tool, they are already being used in machine learning applications where health and human life are at stake. Therefore, it appears to be an obvious necessity not to blindly believe everything coming out of those models without a detailed understanding of their predictions or decisions. Accordingly, this thesis aims at improving the interpretability of models built by a specific family of machine learning algorithms, the so-called tree-based methods. Several mechanisms have been proposed to interpret these models and we aim along this thesis to improve their understanding, study their properties, and define their limitations.

computational statistic & data analysis, predictive feature, random forest variable importance measure, (16 more...)

arXiv.org Machine Learning

2106.09473

Country:

North America > United States > California > San Francisco County > San Francisco (0.13)
North America > United States > California > Alameda County > Berkeley (0.13)
Europe > Spain > Canary Islands (0.04)
(16 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (1.00)
(3 more...)

Ruan, Feng, Liu, Keli, Jordan, Michael I.

Taming Nonconvexity in Kernel Feature Selection---Favorable Properties of the Laplace Kernel

arXiv.org Machine LearningJun-17-2021

Kernel-based feature selection is an important tool in nonparametric statistics. Despite many practical applications of kernel-based feature selection, there is little statistical theory available to support the method. A core challenge is the objective function of the optimization problems used to define kernel-based feature selection are nonconvex. The literature has only studied the statistical properties of the \emph{global optima}, which is a mismatch, given that the gradient-based algorithms available for nonconvex optimization are only able to guarantee convergence to local minima. Studying the full landscape associated with kernel-based methods, we show that feature selection objectives using the Laplace kernel (and other $\ell_1$ kernels) come with statistical guarantees that other kernels, including the ubiquitous Gaussian kernel (or other $\ell_2$ kernels) do not possess. Based on a sharp characterization of the gradient of the objective function, we show that $\ell_1$ kernels eliminate unfavorable stationary points that appear when using an $\ell_2$ kernel. Armed with this insight, we establish statistical guarantees for $\ell_1$ kernel-based feature selection which do not require reaching the global minima. In particular, we establish model-selection consistency of $\ell_1$-kernel-based feature selection in recovering main effects and hierarchical interactions in the nonparametric setting with $n \sim \log p$ samples.

equation, kernel, probability, (16 more...)

arXiv.org Machine Learning

2106.09387

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.67)

#artificialintelligenceJun-16-2021, 14:55:40 GMT

Sentiment Analysis

Sentiment Analysis, as the name suggests, it means to identify the view or emotion behind a situation. It basically means to analyze and find the emotion or intent behind a piece of text or speech or any mode of communication. In this article, we will focus on the sentiment analysis of text data. We, humans, communicate with each other in a variety of languages, and any language is just a mediator or a way in which we try to express ourselves. And, whatever we say has a sentiment associated with it.

customer, sentiment, sentiment analysis, (12 more...)

#artificialintelligence

Industry: Consumer Products & Services (0.47)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.85)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.30)

arXiv.org Artificial IntelligenceJun-16-2021

Structured DropConnect for Uncertainty Inference in Image Classification

Zheng, Wenqing, Xie, Jiyang, Liu, Weidong, Ma, Zhanyu

With the complexity of the network structure, uncertainty inference has become an important task to improve classification accuracy for artificial intelligence systems. For image classification tasks, we propose a structured DropConnect (SDC) framework to model the output of a deep neural network by a Dirichlet distribution. We introduce a DropConnect strategy on weights in the fully connected layers during training. In test, we split the network into several sub-networks, and then model the Dirichlet distribution by match its moments with the mean and variance of the outputs of these sub-networks. The entropy of the estimated Dirichlet distribution is finally utilized for uncertainty inference. In this paper, this framework is implemented on LeNet5 and VGG16 models for misclassification detection and out-of-distribution detection on MNIST and CIFAR-10 datasets. Experimental results show that the performance of the proposed SDC can be comparable to other uncertainty inference methods. Furthermore, the SDC is adapted well to different Figure 1: Illustration of the proposed structured DropConnect network structures with certain generalization capabilities and (SDC). In train phase, DropConnect is used on the research prospects.

dirichlet distribution, dropconnect, uncertainty inference, (11 more...)

2106.08624

Country:

Asia > China > Beijing > Beijing (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)