AITopics

2211.0625

Country:

North America > United States > Virginia (0.04)
Europe > Netherlands (0.04)
Asia > Japan > Honshū > Chūbu > Nagano Prefecture > Nagano (0.04)
Asia > India (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

#artificialintelligenceNov-10-2022, 20:45:08 GMT

WORLD OF CLASSIFICATION IN MACHINE LEARNING

Originally published on Towards AI the World's Leading AI and Technology News and Media Company. If you are building an AI-related product or service, we invite you to consider...

algorithm, node, root node, (14 more...)

#artificialintelligence

Industry: Information Technology > Security & Privacy (0.32)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

#artificialintelligenceNov-10-2022, 15:31:15 GMT

Current Insights on AI, Breast Cancer Screening and the FDA

Is there enough scrutiny of artificial intelligence (AI) software prior to clearance by the Food and Drug Administration (FDA) for adjunctive use in breast cancer screening? Despite the FDA clearance in recent years of several AI products to help identify suspicious breast lesions and facilitate mammography triage, researchers suggested in a recent review, published in JAMA Internal Medicine, that questions remain about data sources, clinical outcome measures and external validation. Here are a few takeaways from their review of the research leading to FDA clearance for nine AI-related products for breast cancer screening between January 1, 2017 and December 31, 2021. All of the clearances for the AI products were based on retrospective analysis of previously existing databases. Only six of the nine products had multicenter studies to support their use and research for four of the AI products lacked information about external validation, according to the review.

ai product, breast cancer screening, richman and colleague, (11 more...)

#artificialintelligence

Country: North America > United States (1.00)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.96)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Government Relations & Public Policy (1.00)
Government > Regional Government > North America Government > United States Government > FDA (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.32)

Pawelczyk, Martin, Lakkaraju, Himabindu, Neel, Seth

On the Privacy Risks of Algorithmic Recourse

As predictive models are increasingly being employed to make consequential decisions, there is a growing emphasis on developing techniques that can provide algorithmic recourse to affected individuals. While such recourses can be immensely beneficial to affected individuals, potential adversaries could also exploit these recourses to compromise privacy. In this work, we make the first attempt at investigating if and how an adversary can leverage recourses to infer private information about the underlying model's training data. To this end, we propose a series of novel membership inference attacks which leverage algorithmic recourse. More specifically, we extend the prior literature on membership inference attacks to the recourse setting by leveraging the distances between data instances and their corresponding counterfactuals output by state-of-the-art recourse methods. Extensive experimentation with real world and synthetic datasets demonstrates significant privacy leakage through recourses. Our work establishes unintended privacy leakage as an important risk in the widespread adoption of recourse methods.

artificial intelligence, machine learning, recourse, (14 more...)

2211.05427

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

WEKA-Based: Key Features and Classifier for French of Five Countries

Li, Zeqian, Qiu, Keyu, Jiao, Chenxu, Zhu, Wen, Tang, Haoran

This paper describes a French dialect recognition system that will appropriately distinguish between different regional French dialects. A corpus of five regions - Monaco, French-speaking, Belgium, French-speaking Switzerland, French-speaking Canada and France, which is targeted forconstruction by the Sketch Engine. The content of the corpus is related to the four themes of eating, drinking, sleeping and living, which are closely linked to popular life. The experimental results were obtained through the processing of a python coded pre-processor and Waikato Environment for Knowledge Analysis (WEKA) data analytic tool which contains many filters and classifiers for machine learning.

artificial intelligence, classifier, machine learning, (14 more...)

2212.08132

Country:

North America > Canada (0.25)
Europe > Switzerland (0.25)
Europe > France (0.25)
(5 more...)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)

Majumder, Suvodeep, Chakraborty, Joymallya, Menzies, Tim

When Less is More: On the Value of "Co-training" for Semi-Supervised Software Defect Predictors

Labeling a module defective or non-defective is an expensive task. Hence, there are often limits on how much-labeled data is available for training. Semi-supervised classifiers use far fewer labels for training models, but there are numerous semi-supervised methods, including self-labeling, co-training, maximal-margin, and graph-based methods, to name a few. Only a handful of these methods have been tested in SE for (e.g.) predicting defects and even that, those tests have been on just a handful of projects. This paper takes a wide range of 55 semi-supervised learners and applies these to over 714 projects. We find that semi-supervised "co-training methods" work significantly better than other approaches. However, co-training needs to be used with caution since the specific choice of co-training methods needs to be carefully selected based on a user's specific goals. Also, we warn that a commonly-used co-training method ("multi-view"-- where different learners get different sets of columns) does not improve predictions (while adding too much to the run time costs 11 hours vs. 1.8 hours). Those cautions stated, we find using these "co-trainers," we can label just 2.5% of data, then make predictions that are competitive to those using 100% of the data. It is an open question worthy of future work to test if these reductions can be seen in other areas of software analytics. All the codes used and datasets analyzed during the current study are available in the https://GitHub.com/Suvodeep90/Semi_Supervised_Methods.

data mining, evolutionary algorithm, machine learning, (18 more...)

2211.0592

Country:

North America > United States > North Carolina (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia (0.04)

Genre:

Workflow (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (0.68)

Industry:

Education (0.66)
Health & Medicine (0.46)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Software (1.00)
Information Technology > Data Science > Data Mining (1.00)
(8 more...)

Parraga, Otávio, More, Martin D., Oliveira, Christian M., Gavenski, Nathan S., Kupssinskü, Lucas S., Medronha, Adilson, Moura, Luis V., Simões, Gabriel S., Barros, Rodrigo C.

Debiasing Methods for Fairer Neural Models in Vision and Language Research: A Survey

Despite being responsible for state-of-the-art results in several computer vision and natural language processing tasks, neural networks have faced harsh criticism due to some of their current shortcomings. One of them is that neural networks are correlation machines prone to model biases within the data instead of focusing on actual useful causal relationships. This problem is particularly serious in application domains affected by aspects such as race, gender, and age. To prevent models from incurring on unfair decision-making, the AI community has concentrated efforts in correcting algorithmic biases, giving rise to the research area now widely known as fairness in AI. In this survey paper, we provide an in-depth overview of the main debiasing methods for fairness-aware neural networks in the context of vision and language research. We propose a novel taxonomy to better organize the literature on debiasing methods for fairness, and we discuss the current challenges, trends, and important future work directions for the interested researcher and practitioner.

artificial intelligence, machine learning, natural language, (16 more...)

2211.05617

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
South America > Brazil (0.04)
North America > Canada > Quebec > Montreal (0.04)
(4 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Law (0.67)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(3 more...)

Rawat, Satyendra Singh, Mishra, Amit Kumar

Review of Methods for Handling Class-Imbalanced in Classification Problems

Learning classifiers using skewed or imbalanced datasets can occasionally lead to classification issues; this is a serious issue. In some cases, one class contains the majority of examples while the other, which is frequently the more important class, is nevertheless represented by a smaller proportion of examples. Using this kind of data could make many carefully designed machine-learning systems ineffective. High training fidelity was a term used to describe biases vs. all other instances of the class. The best approach to all possible remedies to this issue is typically to gain from the minority class. The article examines the most widely used methods for addressing the problem of learning with a class imbalance, including data-level, algorithm-level, hybrid, cost-sensitive learning, and deep learning, etc. including their advantages and limitations. The efficiency and performance of the classifier are assessed using a myriad of evaluation metrics.

algorithm, artificial intelligence, machine learning, (15 more...)

2211.05456

Country: Asia > India (0.04)

Genre:

Overview (1.00)
Research Report (0.82)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Law Enforcement & Public Safety > Fraud (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.90)

Ramachandra, Sandeep, Vandewiele, Gilles, Mijnsbrugge, David Vander, Ongenae, Femke, Van Hoecke, Sofie

Perfectly predicting ICU length of stay: too good to be true

A paper of Alsinglawi et al was recently accepted and published in Scientific Reports. In this paper, the authors aim to predict length of stay (LOS), discretized into either long (> 7 days) or short stays (< 7 days), of lung cancer patients in an ICU department using various machine learning techniques. The authors claim to achieve perfect results with an Area Under the Receiver Operating Characteristic curve (AUROC) of 100% with a Random Forest (RF) classifier with ADASYN class balancing over sampling technique, which if accurate could have significant implications for hospital management. However, we have identified several methodological flaws within the manuscript which cause the results to be overly optimistic and would have serious consequences if used in a clinical practice. Moreover, the reporting of the methodology is unclear and many important details are missing from the manuscript, which makes reproduction extremely difficult. We highlight the effect these oversights have had on the result and provide a more believable result of 88.91% AUROC when these oversights are corrected.

artificial intelligence, dataset, machine learning, (18 more...)

2211.05597

Country: Europe > Belgium > Flanders > East Flanders > Ghent (0.05)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.36)

#artificialintelligenceNov-9-2022, 20:00:37 GMT

Applications of Naive Bayes part1(Artificial Intelligence)

Abstract: In many classification models, data is discretized to better estimate its distribution. Existing discretization methods often target at maximizing the discriminant power of discretized data, while overlooking the fact that the primary target of data discretization in classification is to improve the generalization performance. As a result, the data tend to be over-split into many small bins since the data without discretization retain the maximal discriminant information. Thus, we propose a Max-Dependency-Min-Divergence (MDmD) criterion that maximizes both the discriminant information and generalization ability of the discretized data. More specifically, the Max-Dependency criterion maximizes the statistical dependency between the discretized data and the classification variable while the Min-Divergence criterion explicitly minimizes the JS-divergence between the training data and the validation data for a given discretization scheme.

application, artificial intelligence, discriminant information, (11 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.78)