AITopics | Mirkes, Evgeny M.


What is Hiding in Medicine's Dark Matter? Learning with Missing Data in Medical Practices

Suzen, Neslihan, Mirkes, Evgeny M., Roland, Damian, Levesley, Jeremy, Gorban, Alexander N., Coats, Tim J.

arXiv.org Artificial Intelligence, Feb-9-2024

Electronic patient records (EPRs) produce a wealth of data but contain significant missing information. Understanding and handling this missing data is an important part of clinical data analysis and, if left unaddressed, could result in bias in analysis and distortion in critical conclusions. Missing data may be linked to health care professional practice patterns, and imputation of missing data can increase the validity of clinical decisions. This study focuses on statistical approaches for understanding and interpreting the missing data and on machine learning based clinical data imputation, using a single centre's paediatric emergency data and data from the UK's largest clinical audit database for traumatic injury (TARN). In the study of 56,961 data points related to initial vital signs and observations taken on children presenting to an Emergency Department, we have shown that missing data are likely to be non-random and how these are linked to health care professional practice patterns. We have then examined 79 TARN fields with missing values for 5,791 trauma cases. Singular Value Decomposition (SVD) and k-Nearest Neighbour (kNN) based missing data imputation methods are used, and imputation results are compared against the original dataset and statistically tested. We have concluded that the 1NN imputer is the best imputation method, which indicates a usual pattern of clinical decision making: find the most similar patients and take their attributes as imputation.

data quality, dataset, imputation, (17 more...)

doi: 10.1109/BigData59044.2023.10386194

arXiv: 2402.06563

Country:

North America > United States > Alaska > North Slope Borough (0.24)
Europe > United Kingdom > England (0.16)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Consumer Health (0.86)
Health & Medicine > Health Care Technology > Medical Record (0.48)
Health & Medicine > Diagnostic Medicine > Vital Signs (0.34)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
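
The 1NN imputation idea described in the abstract above (find the most similar patient and take their attributes) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function name `impute_1nn`, the toy `records` array and its column meanings are hypothetical, distances are mean squared differences over the fields both rows have observed, and no feature scaling is applied (real clinical data would likely need it).

```python
import numpy as np

def impute_1nn(X):
    """Fill each row's NaNs with the values of its single nearest donor row."""
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    missing = np.isnan(X)
    for i in np.where(missing.any(axis=1))[0]:
        best_j, best_d = None, np.inf
        for j in range(X.shape[0]):
            if j == i:
                continue
            # A donor must have observed every field that row i lacks.
            if missing[j, missing[i]].any():
                continue
            shared = ~missing[i] & ~missing[j]
            if not shared.any():
                continue
            # Mean squared difference over the fields both rows observed.
            d = np.mean((X[i, shared] - X[j, shared]) ** 2)
            if d < best_d:
                best_j, best_d = j, d
        if best_j is not None:
            filled[i, missing[i]] = X[best_j, missing[i]]
    return filled

# Hypothetical toy data; columns: heart rate, respiratory rate, temperature.
records = np.array([
    [120.0, 30.0, 37.0],
    [80.0, 18.0, 36.6],
    [118.0, np.nan, 37.1],
    [82.0, 19.0, np.nan],
])
filled = impute_1nn(records)
print(filled)
```

Rows that have no eligible donor are left unchanged, so any remaining NaN signals a record that could not be imputed this way.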

Weakly Supervised Learners for Correction of AI Errors with Provable Performance Guarantees

Tyukin, Ivan Y., Tyukina, Tatiana, van Helden, Daniel, Zheng, Zedong, Mirkes, Evgeny M., Sutton, Oliver J., Zhou, Qinghua, Gorban, Alexander N., Allison, Penelope

arXiv.org Artificial Intelligence, Feb-6-2024

We present a new methodology for handling AI errors by introducing weakly supervised AI error correctors with a priori performance guarantees. These AI correctors are auxiliary maps whose role is to moderate the decisions of some previously constructed underlying classifier by either approving or rejecting its decisions. The rejection of a decision can be used as a signal to suggest abstaining from making a decision. A key technical focus of the work is in providing performance guarantees for these new AI correctors through bounds on the probabilities of incorrect decisions. These bounds are distribution agnostic and do not rely on assumptions on the data dimension. Our empirical example illustrates how the framework can be applied to improve the performance of an image classifier in a challenging real-world task where training data are scarce.

artificial intelligence, corrector, machine learning, (18 more...)

arXiv: 2402.00899

Country: Europe > United Kingdom (0.15)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Scikit-dimension: a Python package for intrinsic dimension estimation

Bac, Jonathan, Mirkes, Evgeny M., Gorban, Alexander N., Tyukin, Ivan, Zinovyev, Andrei

arXiv.org Machine Learning, Sep-6-2021

Dealing with uncertainty in applications of machine learning to real-life data critically depends on the knowledge of intrinsic dimensionality (ID). A number of methods have been suggested for the purpose of estimating ID, but no standard package to easily apply them one by one or all at once has been implemented in Python. This technical note introduces scikit-dimension, an open-source Python package for intrinsic dimension estimation. The scikit-dimension package provides a uniform implementation of most of the known ID estimators, based on the scikit-learn application programming interface, to evaluate global and local intrinsic dimension, as well as generators of synthetic toy and benchmark datasets widespread in the literature. The package is developed with tools assessing the code quality, coverage, unit testing and continuous integration. We briefly describe the package and demonstrate its use in a large-scale (more than 500 datasets) benchmarking of methods for ID estimation in real-life and synthetic data. The source code is available from https://github.com/j-bac/scikit-dimension, and the documentation is available from https://scikit-dimension.readthedocs.io.

dataset, health & medicine, oncology, (18 more...)

arXiv: 2109.02596

Country:

Europe > France (0.29)
Europe > Russia (0.28)
Europe > United Kingdom > England (0.14)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology:

Information Technology > Software (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
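
To make the notion of global intrinsic dimension from the abstract above concrete, here is a small NumPy-only sketch of one well-known estimator, the two-nearest-neighbour ratio method in its maximum-likelihood form. It is not the scikit-dimension implementation and does not use that package's API; the function name `twonn_dimension` and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def twonn_dimension(X):
    """Estimate global intrinsic dimension from the ratio of each point's
    second to first nearest-neighbour distance (maximum-likelihood form)."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared distances via the Gram matrix.
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)          # clip tiny negative rounding errors
    np.fill_diagonal(d2, np.inf)      # exclude self-distances
    two = np.sort(np.partition(d2, 1, axis=1)[:, :2], axis=1)
    r1 = np.sqrt(two[:, 0])
    r2 = np.sqrt(two[:, 1])
    keep = r1 > 0                     # skip exact duplicates
    mu = r2[keep] / r1[keep]
    # log(mu) is exponentially distributed with rate equal to the dimension.
    return keep.sum() / np.sum(np.log(mu))

# Synthetic check: a 2-dimensional sheet linearly embedded in 5 dimensions.
rng = np.random.default_rng(0)
latent = rng.uniform(size=(1000, 2))
basis = rng.normal(size=(2, 5))
X = latent @ basis
estimate = twonn_dimension(X)
print(f"estimated intrinsic dimension: {estimate:.2f}")
```

The estimate should come out close to 2 rather than 5, because the data occupy a two-dimensional sheet even though each point has five coordinates.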

High-dimensional separability for one- and few-shot learning

Gorban, Alexander N., Grechuk, Bogdan, Mirkes, Evgeny M., Stasenko, Sergey V., Tyukin, Ivan Y.

arXiv.org Artificial Intelligence, Jun-28-2021

This work is driven by a practical question: the correction of Artificial Intelligence (AI) errors. Systematic re-training of a large AI system is hardly possible. To solve this problem, special external devices, correctors, are developed. They should provide a quick and non-iterative system fix without modification of a legacy AI system. A common universal part of the AI corrector is a classifier that should separate undesired and erroneous behavior from normal operation. Training of such classifiers is a grand challenge at the heart of the one- and few-shot learning methods. Effectiveness of one- and few-shot methods is based on either significant dimensionality reductions or the blessing of dimensionality effects. Stochastic separability is a blessing of dimensionality phenomenon that allows one- and few-shot error correction: in high-dimensional datasets under broad assumptions each point can be separated from the rest of the set by a simple and robust linear discriminant. The hierarchical structure of the data universe is introduced where each data cluster has a granular internal structure, etc. New stochastic separation theorems for the data distributions with fine-grained structure are formulated and proved. Separation theorems in infinite-dimensional limits are proven under assumptions of compact embedding of patterns into data space. New multi-correctors of AI systems are presented and illustrated with examples of predicting errors and learning new classes of objects by a deep convolutional neural network.

certification method, machine learning, teaching method, (31 more...)

arXiv: 2106.15416

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > New York (0.14)
Europe > United Kingdom > England (0.14)

Genre:

Overview (0.45)
Research Report (0.40)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)
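
The stochastic separability claim in the abstract above can be checked numerically with a small experiment. The sketch below is an illustration under stated assumptions, not the paper's construction: it samples points uniformly from a cube, and counts a point x as separable when the inner product of x with every other centred point y stays below alpha times the inner product of x with itself (a Fisher-type linear discriminant). The function name, the threshold `alpha = 0.8`, the sample size and the two dimensions compared are all choices made for this example.

```python
import numpy as np

def fisher_separable_fraction(X, alpha=0.8):
    """Fraction of points x with <x, y> < alpha * <x, x> for every other y,
    computed after centring the sample."""
    X = np.asarray(X, dtype=float)
    X = X - X.mean(axis=0)
    G = X @ X.T                        # G[i, j] = <x_i, x_j>
    thresh = alpha * np.diag(G)        # alpha * <x_i, x_i>, copied before masking
    np.fill_diagonal(G, -np.inf)       # ignore each point's product with itself
    return float(np.mean(G.max(axis=1) < thresh))

rng = np.random.default_rng(0)
n_points = 1000
low_dim = fisher_separable_fraction(rng.uniform(-1, 1, size=(n_points, 2)))
high_dim = fisher_separable_fraction(rng.uniform(-1, 1, size=(n_points, 100)))
print(f"separable fraction in   2 dimensions: {low_dim:.3f}")
print(f"separable fraction in 100 dimensions: {high_dim:.3f}")
```

With the same number of points, the separable fraction is small in two dimensions and approaches one in a hundred dimensions, which is the effect that makes one-shot linear correctors plausible.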

Robust and scalable learning of data manifolds with complex topologies via ElPiGraph

Albergante, Luca, Mirkes, Evgeny M., Chen, Huidong, Martin, Alexis, Faure, Louis, Barillot, Emmanuel, Pinello, Luca, Gorban, Alexander N., Zinovyev, Andrei

arXiv.org Machine Learning, Apr-20-2018

We present ElPiGraph, a method for approximating data distributions having non-trivial topological features such as the existence of excluded regions or branching structures. Unlike many existing methods, ElPiGraph is not based on the construction of a k-nearest neighbour graph, a procedure that can perform poorly in the case of multidimensional and noisy data. Instead, ElPiGraph constructs elastic principal graphs in a more robust way by minimizing elastic energy, applying graph grammars and explicitly controlling topological complexity. Using a trimmed approximation error function makes ElPiGraph extremely robust to the presence of background noise without decreasing computational performance and allows it to deal with complex cases of manifold learning (for example, ElPiGraph can learn disconnected intersecting manifolds). Thanks to the quasi-quadratic nature of the elastic function, ElPiGraph performs almost as fast as simple k-means clustering and is therefore much more scalable than alternative methods; it can work on large datasets containing millions of high-dimensional points on a personal computer. The excellent performance of the method opens the possibility to apply resampling and to approximate complex data structures via principal graph ensembles, which can be used to construct consensus principal graphs. ElPiGraph is currently implemented in five programming languages and accompanied by a graphical user interface, which makes it a versatile tool to deal with complex data in various fields, from molecular biology, where it can be used to infer pseudo-time trajectories from single-cell RNASeq, to astronomy, where it can be used to approximate complex structures in the distribution of galaxies.

artificial intelligence, elpigraph, health & medicine, (19 more...)

arXiv: 1804.0758

Country:

Europe (0.93)
North America > United States > Massachusetts (0.28)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)
