Collaboration, AIX-COVNET
Navigating the challenges in creating complex data systems: a development philosophy
Dittmer, Sören, Roberts, Michael, Gilbey, Julian, Biguri, Ander, Collaboration, AIX-COVNET, Preller, Jacobus, Rudd, James H. F., Aston, John A. D., Schönlieb, Carola-Bibiane
In this perspective, we argue that despite the democratization of powerful tools for data science and machine learning over the last decade, developing the code for a trustworthy and effective data science system (DSS) is getting harder. Perverse incentives and a lack of widespread software engineering (SE) skills are among the many root causes we identify that naturally give rise to the current systemic crisis in the reproducibility of DSSs. We analyze why SE and the building of large, complex systems are, in general, hard. Based on these insights, we identify how SE addresses those difficulties and how we can apply and generalize SE methods to construct DSSs that are fit for purpose. We advocate two key development philosophies, namely that one should incrementally grow - not biphasically plan and build - DSSs, and one should always employ two types of feedback loops during development: one which tests the code's correctness and another that evaluates the code's efficacy.
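As a concrete illustration of the two feedback loops, the following Python sketch pairs a deterministic unit test, which checks correctness against a known answer, with a held-out evaluation, which measures whether the pipeline is actually effective. The function names (normalise, evaluate_efficacy) and the toy pipeline are illustrative assumptions, not taken from the paper.

    import numpy as np

    def normalise(x, mean, std):
        # Standardise features using statistics supplied by the caller.
        return (x - mean) / std

    # Feedback loop 1 -- correctness: a deterministic test with a known expected outcome.
    def test_normalise_gives_zero_mean_unit_variance():
        rng = np.random.default_rng(0)
        x = rng.normal(loc=5.0, scale=3.0, size=(100, 4))
        z = normalise(x, x.mean(axis=0), x.std(axis=0))
        assert np.allclose(z.mean(axis=0), 0.0)
        assert np.allclose(z.std(axis=0), 1.0)

    # Feedback loop 2 -- efficacy: evaluate the whole pipeline on held-out data.
    def evaluate_efficacy(model, x_train, y_train, x_val, y_val):
        mean, std = x_train.mean(axis=0), x_train.std(axis=0)
        model.fit(normalise(x_train, mean, std), y_train)
        predictions = model.predict(normalise(x_val, mean, std))
        return (predictions == y_val).mean()  # e.g. validation accuracy

The first loop can run on every change (for example, in continuous integration), while the second guards against a system that is correct but not useful.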
Classification of datasets with imputed missing values: does imputation quality matter?
Shadbahr, Tolou, Roberts, Michael, Stanczuk, Jan, Gilbey, Julian, Teare, Philip, Dittmer, Sören, Thorpe, Matthew, Torne, Ramon Vinas, Sala, Evis, Lio, Pietro, Patel, Mishal, Collaboration, AIX-COVNET, Rudd, James H. F., Mirtti, Tuomas, Rannikko, Antti, Aston, John A. D., Tang, Jing, Schönlieb, Carola-Bibiane
Classifying samples in incomplete datasets is a common aim for machine learning practitioners, but it is non-trivial. Missing data are found in most real-world datasets, and these missing values are typically imputed using established methods, followed by classification of the now-complete, imputed samples. The focus of the machine learning researcher is then to optimise the downstream classification performance. In this study, we highlight that it is imperative to consider the quality of the imputation. We demonstrate how the commonly used measures for assessing quality are flawed and propose a new class of discrepancy scores that focus on how well the method recreates the overall distribution of the data. To conclude, we highlight the compromised interpretability of classifier models trained using poorly imputed data. All code and data used in this paper are also released publicly at [inserted upon publication].
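To make the distinction concrete, the Python sketch below contrasts a per-entry error on the imputed values with a simple distribution-level discrepancy between the imputed and the fully observed data. This is not the paper's proposed class of scores; the function names and the choice of an averaged per-feature Wasserstein distance are illustrative assumptions.

    import numpy as np
    from scipy.stats import wasserstein_distance

    def imputation_rmse(x_true, x_imputed, missing_mask):
        # Element-wise error, computed only over the originally missing entries.
        diff = (x_true - x_imputed)[missing_mask]
        return np.sqrt(np.mean(diff ** 2))

    def distributional_discrepancy(x_true, x_imputed):
        # Average 1D Wasserstein distance per feature: a crude check of how well
        # the imputed dataset reproduces the overall distribution of the data.
        return np.mean([
            wasserstein_distance(x_true[:, j], x_imputed[:, j])
            for j in range(x_true.shape[1])
        ])

A method can achieve a low per-entry error (for example, mean imputation) while badly distorting the overall distribution, which is the failure mode that distribution-level scores are designed to expose.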