AITopics

2010.00718

Country:

North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Iowa > Story County > Ames (0.04)
North America > United States > Alabama > Jefferson County > Birmingham (0.04)
(2 more...)

Genre: Research Report > New Finding (0.66)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Cross Validation (0.61)

Banegas-Luna, Antonio-Jesús, Peña-García, Jorge, Iftene, Adrian, Guadagni, Fiorella, Ferroni, Patrizia, Scarpato, Noemi, Zanzotto, Fabio Massimo, Bueno-Crespo, Andrés, Pérez-Sánchez, Horacio

When will the mist clear? On the Interpretability of Machine Learning for Medical Applications: a survey

arXiv.org Artificial IntelligenceOct-1-2020

Artificial Intelligence is providing astonishing results, with medicine being one of its favourite playgrounds. In a few decades, computers may be capable of formulating diagnoses and choosing the correct treatment, while robots may perform surgical operations, and conversational agents could interact with patients as virtual coaches. Machine Learning and, in particular, Deep Neural Networks are behind this revolution. In this scenario, important decisions will be controlled by standalone machines that have learned predictive models from provided data. Among the most challenging targets of interest in medicine are cancer diagnosis and therapies but, to start this revolution, software tools need to be adapted to cover the new requirements. In this sense, learning tools are becoming a commodity in Python and Matlab libraries, just to name two, but to exploit all their possibilities, it is essential to fully understand how models are interpreted and which models are more interpretable than others. In this survey, we analyse current machine learning models, frameworks, databases and other related tools as applied to medicine - specifically, to cancer research - and we discuss their interpretability, performance and the necessary input data. From the evidence available, ANN, LR and SVM have been observed to be the preferred models. Besides, CNNs, supported by the rapid development of GPUs and tensor-oriented programming libraries, are gaining in importance. However, the interpretability of results by doctors is rarely considered which is a factor that needs to be improved. We therefore consider this study to be a timely contribution to the issue.

artificial intelligence, expert system, machine learning, (18 more...)

2010.00353

Country:

Europe > Italy > Lazio > Rome (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Oceania > New Zealand > North Island > Waikato (0.04)
(10 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Therapeutic Area > Oncology > Colorectal Cancer (0.47)
Health & Medicine > Therapeutic Area > Oncology > Head & Neck Cancer (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.87)

Santana, Marlesson R. O., Melo, Luckeciano C., Camargo, Fernando H. F., Brandão, Bruno, Soares, Anderson, Oliveira, Renan M., Caetano, Sandor

MARS-Gym: A Gym framework to model, train, and evaluate Recommender Systems for Marketplaces

arXiv.org Machine LearningSep-30-2020

Recommender Systems are especially challenging for marketplaces since they must maximize user satisfaction while maintaining the healthiness and fairness of such ecosystems. In this context, we observed a lack of resources to design, train, and evaluate agents that learn by interacting within these environments. For this matter, we propose MARS-Gym, an open-source framework to empower researchers and engineers to quickly build and evaluate Reinforcement Learning agents for recommendations in marketplaces. MARS-Gym addresses the whole development pipeline: data processing, model design and optimization, and multi-sided evaluation. We also provide the implementation of a diverse set of baseline agents, with a metrics-driven analysis of them in the Trivago marketplace dataset, to illustrate how to conduct a holistic assessment using the available metrics of recommendation, off-policy estimation, and fairness. With MARS-Gym, we expect to bridge the gap between academic research and production systems, as well as to facilitate the design of new algorithms and applications.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

2010.07035

Country:

North America > United States > New York > New York County > New York City (0.05)
South America > Brazil > Goiás (0.05)
North America > United States > Illinois > Cook County > Chicago (0.04)
(4 more...)

Genre: Research Report (0.82)

Industry:

Leisure & Entertainment (0.93)
Consumer Products & Services (0.93)
Information Technology > Services (0.68)
Media > Music (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.87)
(2 more...)

Pei, Yulong, Huang, Tianjin, van Ipenburg, Werner, Pechenizkiy, Mykola

ResGCN: Attention-based Deep Residual Modeling for Anomaly Detection on Attributed Networks

arXiv.org Machine LearningSep-30-2020

Effectively detecting anomalous nodes in attributed networks is crucial for the success of many real-world applications such as fraud and intrusion detection. Existing approaches have difficulties with three major issues: sparsity and nonlinearity capturing, residual modeling, and network smoothing. We propose Residual Graph Convolutional Network (ResGCN), an attention-based deep residual modeling approach that can tackle these issues: modeling the attributed networks with GCN allows to capture the sparsity and nonlinearity; utilizing a deep neural network allows to directly learn residual from the input, and a residual-based attention mechanism reduces the adverse effect from anomalous nodes and prevents over-smoothing. Extensive experiments on several real-world attributed networks demonstrate the effectiveness of ResGCN in detecting anomalies.

artificial intelligence, data mining, machine learning, (19 more...)

2009.14738

Country:

Europe > Netherlands > North Brabant > Eindhoven (0.04)
North America > United States > Texas > Brazos County > College Station (0.04)
North America > United States > California > Santa Clara County > Mountain View (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Security & Privacy (0.66)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Hossain, Md Delowar, Kabir, Muhammad Ashad, Anwar, Adnan, Islam, Md Zahidul

Detecting Autism Spectrum Disorder using Machine Learning

arXiv.org Machine LearningSep-30-2020

Autism Spectrum Disorder (ASD), which is a neuro development disorder, is often accompanied by sensory issues such an over sensitivity or under sensitivity to sounds and smells or touch. Although its main cause is genetics in nature, early detection and treatment can help to improve the conditions. In recent years, machine learning based intelligent diagnosis has been evolved to complement the traditional clinical methods which can be time consuming and expensive. The focus of this paper is to find out the most significant traits and automate the diagnosis process using available classification techniques for improved diagnosis purpose. We have analyzed ASD datasets of Toddler, Child, Adolescent and Adult. We determine the best performing classifier for these binary datasets using the evaluation metrics recall, precision, F-measures and classification errors. Our finding shows that Sequential minimal optimization (SMO) based Support Vector Machines (SVM) classifier outperforms all other benchmark machine learning algorithms in terms of accuracy during the detection of ASD cases and produces less classification errors compared to other algorithms. Also, we find that Relief Attributes algorithm is the best to identify the most significant attributes in ASD datasets.

artificial intelligence, dataset, machine learning, (13 more...)

2009.14499

Country:

Oceania > Australia (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology > Autism (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.88)

arXiv.org Artificial IntelligenceSep-30-2020

Uncertainty Estimation For Community Standards Violation In Online Social Networks

Torabi, Narjes, Arora, Nimar S., Yu, Emma, Shah, Kinjal, Liu, Wenshun, Tingley, Michael

Online Social Networks (OSNs) provide a platform for users to share their thoughts and opinions with their community of friends or to the general public. In order to keep the platform safe for all users, as well as to keep it compliant with local laws, OSNs typically create a set of community standards organized into policy groups, and use Machine Learning (ML) models to identify and remove content that violates any of the policies. However, out of the billions of content that is uploaded on a daily basis only a small fraction is so unambiguously violating that it can be removed by the automated models. Prevalence estimation is the task of estimating the fraction of violating content in the residual items by sending a small sample of these items to human labelers to get ground truth labels. This task is exceedingly hard because even though we can easily get the ML scores or features for all of the billions of items we can only get ground truth labels on a few thousands of these items due to practical considerations. Indeed the prevalence can be so low that even after a judicious choice of items to be labeled there can be many days in which not even a single item is labeled violating. A pragmatic choice for such low prevalence, $10^{-4}$ to $10^{-5}$, regimes is to report the upper bound, or $97.5\%$ confidence interval, prevalence (UBP) that takes the uncertainties of the sampling and labeling processes into account and gives a smoothed estimate. In this work we present two novel techniques Bucketed-Beta-Binomial and a Bucketed-Gaussian Process for this UBP task and demonstrate on real and simulated data that it has much better coverage than the commonly used bootstrapping technique.

artificial intelligence, machine learning, prevalence, (20 more...)

2009.14519

Country: North America > United States > Minnesota (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.70)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
(2 more...)

Tsigler, A., Bartlett, P. L.

Benign overfitting in ridge regression

arXiv.org Machine LearningSep-29-2020

Classical learning theory suggests that strong regularization is needed to learn a class with large complexity. This intuition is in contrast with the modern practice of machine learning, in particular learning neural networks, where the number of parameters often exceeds the number of data points. It has been observed empirically that such overparametrized models can show good generalization performance even if trained with vanishing or negative regularization. The aim of this work is to understand theoretically how this effect can occur, by studying the setting of ridge regression. We provide non-asymptotic generalization bounds for overparametrized ridge regression that depend on the arbitrary covariance structure of the data, and show that those bounds are tight for a range of regularization parameter values. To our knowledge this is the first work that studies overparametrized ridge regression in such a general setting. We identify when small or negative regularization is sufficient for obtaining small generalization error. On the technical side, our bounds only require the data vectors to be i.i.d. sub-gaussian, while most previous work assumes independence of the components of those vectors.

artificial intelligence, machine learning, probability, (18 more...)

2009.14286

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Englhardt, Adrian, Trittenbach, Holger, Kottke, Daniel, Sick, Bernhard, Böhm, Klemens

Efficient SVDD Sampling with Approximation Guarantees for the Decision Boundary

arXiv.org Machine LearningSep-29-2020

Support Vector Data Description (SVDD) is a popular one-class classifiers for anomaly and novelty detection. But despite its effectiveness, SVDD does not scale well with data size. To avoid prohibitive training times, sampling methods select small subsets of the training data on which SVDD trains a decision boundary hopefully equivalent to the one obtained on the full data set. According to the literature, a good sample should therefore contain so-called boundary observations that SVDD would select as support vectors on the full data set. However, non-boundary observations also are essential to not fragment contiguous inlier regions and avoid poor classification accuracy. Other aspects, such as selecting a sufficiently representative sample, are important as well. But existing sampling methods largely overlook them, resulting in poor classification accuracy. In this article, we study how to select a sample considering these points. Our approach is to frame SVDD sampling as an optimization problem, where constraints guarantee that sampling indeed approximates the original decision boundary. We then propose RAPID, an efficient algorithm to solve this optimization problem. RAPID does not require any tuning of parameters, is easy to implement and scales well to large data sets. We evaluate our approach on real-world and synthetic data. Our evaluation is the most comprehensive one for SVDD sampling so far. Our results show that RAPID outperforms its competitors in classification accuracy, in sample size, and in runtime.

artificial intelligence, machine learning, svdd, (15 more...)

2009.13853

Country: Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.58)

arXiv.org Artificial IntelligenceSep-29-2020

SynSetExpan: An Iterative Framework for Joint Entity Set Expansion and Synonym Discovery

Shen, Jiaming, Qiu, Wenda, Shang, Jingbo, Vanni, Michelle, Ren, Xiang, Han, Jiawei

Entity set expansion and synonym discovery are two critical NLP tasks. Previous studies accomplish them separately, without exploring their interdependencies. In this work, we hypothesize that these two tasks are tightly coupled because two synonymous entities tend to have similar likelihoods of belonging to various semantic classes. This motivates us to design SynSetExpan, a novel framework that enables two tasks to mutually enhance each other. SynSetExpan uses a synonym discovery model to include popular entities' infrequent synonyms into the set, which boosts the set expansion recall. Meanwhile, the set expansion model, being able to determine whether an entity belongs to a semantic class, can generate pseudo training data to fine-tune the synonym discovery model towards better accuracy. To facilitate the research on studying the interplays of these two tasks, we create the first large-scale Synonym-Enhanced Set Expansion (SE2) dataset via crowdsourcing. Extensive experiments on the SE2 dataset and previous benchmarks demonstrate the effectiveness of SynSetExpan for both entity set expansion and synonym discovery tasks.

artificial intelligence, machine learning, natural language, (18 more...)

2009.13827

Country:

Asia > China > Tibet Autonomous Region (0.14)
Europe > United Kingdom > England > West Sussex (0.14)
North America > United States > Texas (0.05)
(27 more...)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Sports > Basketball (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Military (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.93)

Lalitha, S. D., Thyagharajan, K. K.

Micro-Facial Expression Recognition in Video Based on Optimal Convolutional Neural Network (MFEOCNN) Algorithm

arXiv.org Artificial IntelligenceSep-29-2020

Facial expression is a standout amongst the most imperative features of human emotion recognition. For demonstrating the emotional states facial expressions are utilized by the people. In any case, recognition of facial expressions has persisted a testing and intriguing issue with regards to PC vision. Recognizing the Micro-Facial expression in video sequence is the main objective of the proposed approach. For efficient recognition, the proposed method utilizes the optimal convolution neural network. Here the proposed method considering the input dataset is the CK+ dataset. At first, by means of Adaptive median filtering preprocessing is performed in the input image. From the preprocessed output, the extracted features are Geometric features, Histogram of Oriented Gradients features and Local binary pattern features. The novelty of the proposed method is, with the help of Modified Lion Optimization (MLO) algorithm, the optimal features are selected from the extracted features. In a shorter computational time, it has the benefits of rapidly focalizing and effectively acknowledging with the aim of getting an overall arrangement or idea. Finally, the recognition is done by Convolution Neural network (CNN). Then the performance of the proposed MFEOCNN method is analysed in terms of false measures and recognition accuracy. This kind of emotion recognition is mainly used in medicine, marketing, E-learning, entertainment, law and monitoring. From the simulation, we know that the proposed approach achieves maximum recognition accuracy of 99.2% with minimum Mean Absolute Error (MAE) value. These results are compared with the existing for MicroFacial Expression Based Deep-Rooted Learning (MFEDRL), Convolutional Neural Network with Lion Optimization (CNN+LO) and Convolutional Neural Network (CNN) without optimization. The simulation of the proposed method is done in the working platform of MATLAB.

artificial intelligence, machine learning, recognition, (18 more...)

doi: 10.35940/ijeat.A9802.109119

2009.13792

Country: Asia > India > Tamil Nadu > Chennai (0.04)

Genre: Research Report (0.64)

Industry:

Leisure & Entertainment (0.54)
Law (0.54)
Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision > Face Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)