Collaborating Authors


AI with swarm intelligence: A novel technology for cooperative analysis of big data


Science and medicine are becoming increasingly digital. Analyzing the resulting volumes of information -- known as "big data" -- is considered a key to better treatment options. "Medical research data are a treasure. They can play a decisive role in developing personalized therapies that are tailored to each individual more precisely than conventional treatments," said Joachim Schultze, Director of Systems Medicine at the DZNE and professor at the Life & Medical Sciences Institute (LIMES) at the University of Bonn. "It's critical for science to be able to use such data as comprehensively and from as many sources as possible."

AI with swarm intelligence learns to detect cancer, lung diseases and COVID-19


Following a similar principle--called "swarm learning"--an international research team has trained artificial intelligence algorithms to detect blood cancer, lung diseases and COVID-19 in data stored in a decentralized fashion. This approach has an advantage over conventional methods because it inherently preserves privacy, which facilitates cross-site analysis of scientific data. Swarm learning could thus significantly promote and accelerate collaboration and information exchange in research, especially in the field of medicine. Experts from the DZNE, the University of Bonn, the information technology company Hewlett Packard Enterprise (HPE) and other research institutions report on this in the scientific journal Nature.
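
The core idea can be illustrated in a few lines: each site trains on its own private data and only the learned model parameters are exchanged and combined. The sketch below uses simple parameter averaging across three simulated "hospitals" with numpy; the blockchain coordination and architecture of the actual Swarm Learning system described in the article are not shown.

```python
import numpy as np

# Each "site" fits a logistic-regression weight vector on its private data;
# only the learned parameters (never the raw patient records) leave the site.
def local_fit(X, y, steps=200, lr=0.1):
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

rng = np.random.default_rng(0)
true_w = np.array([1.5, -2.0, 0.5])
sites = []
for _ in range(3):  # three hospitals, each with its own private cohort
    X = rng.normal(size=(100, 3))
    y = (X @ true_w + rng.normal(scale=0.5, size=100) > 0).astype(float)
    sites.append((X, y))

# Combine the sites' knowledge by averaging their locally trained weights.
swarm_w = np.mean([local_fit(X, y) for X, y in sites], axis=0)
print(np.sign(swarm_w))  # signs of the averaged weights
```

The averaged model benefits from all three cohorts even though no raw data crossed site boundaries, which is the privacy property the article highlights.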

Thought Leadership Webcast -- AI Ethics


Amita Kapoor is an Associate Professor in the Department of Electronics, SRCASW, University of Delhi, and has been actively teaching neural networks and artificial intelligence for the last 20 years. She completed her master's in Electronics in 1996 and her Ph.D. in 2011; during her Ph.D. she was awarded a prestigious DAAD fellowship to pursue part of her research at the Karlsruhe Institute of Technology, Karlsruhe, Germany. She received the Best Presentation Award at the Photonics 2008 international conference. She is an active member of ACM, AAAI, IEEE, and INNS. She has co-authored four books, including the best-selling "Deep Learning with TensorFlow 2 and Keras" with Packt Publications.

Submit Abstract - Pathology Utilitarian Conference


Do you have the latest findings on pathology or digital pathology? Submit your paper today to enroll and participate in the Pathology Utilitarian Conference & Digital Pathology Meetings.

How rotational invariance of common kernels prevents generalization in high dimensions Machine Learning

Kernel ridge regression is well-known to achieve minimax optimal rates in low-dimensional settings. However, its behavior in high dimensions is much less understood. Recent work establishes consistency for kernel regression under certain assumptions on the ground truth function and the distribution of the input data. In this paper, we show that the rotational invariance property of commonly studied kernels (such as RBF, inner product kernels and fully-connected NTK of any depth) induces a bias towards low-degree polynomials in high dimensions. Our result implies a lower bound on the generalization error for a wide range of distributions and various choices of the scaling for kernels with different eigenvalue decays. This lower bound suggests that general consistency results for kernel ridge regression in high dimensions require a more refined analysis that depends on the structure of the kernel beyond its eigenvalue decay.
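
The rotational invariance the paper refers to is easy to verify directly: an RBF kernel depends on its inputs only through pairwise distances, so rotating every input by the same orthogonal matrix leaves the kernel matrix, and hence the kernel ridge regression fit, unchanged. A minimal numpy sketch (not the paper's analysis):

```python
import numpy as np

# RBF kernel: depends only on pairwise distances, so it is invariant to
# applying one common rotation to all inputs.
def rbf_kernel(A, B, gamma=1.0):
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

# Closed-form kernel ridge regression: alpha = (K + lam I)^{-1} y.
def krr_fit_predict(Xtr, ytr, Xte, lam=1e-2, gamma=1.0):
    K = rbf_kernel(Xtr, Xtr, gamma)
    alpha = np.linalg.solve(K + lam * np.eye(len(Xtr)), ytr)
    return rbf_kernel(Xte, Xtr, gamma) @ alpha

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = X[:, 0] - 2 * X[:, 1]                     # a degree-1 target
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))  # a random rotation matrix

pred = krr_fit_predict(X, y, X)
pred_rot = krr_fit_predict(X @ Q, y, X @ Q)   # rotate all inputs
print(np.allclose(pred, pred_rot))            # True: the fit is unchanged
```

It is this symmetry, shared by RBF, inner product, and fully-connected NTK kernels, that the paper shows induces the bias towards low-degree polynomials in high dimensions.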

Machine Learning Algorithm Predicts Cancer Drug Efficacy


A big part of personalized medicine in cancer is knowing ahead of time whether a drug is likely to be effective. That is usually done by identifying actionable genetic mutations. But a team of researchers recently developed a potentially quicker and more consistent tool based on omics data: a machine learning algorithm that ranks drugs by their anti-proliferative efficacy in cancer cells. Known as Drug Ranking Using Machine Learning (DRUML), the method was developed at Queen Mary University of London and is based on machine learning analysis of protein omics data in cancer cells. DRUML was trained on the responses of cancer cells to 412 cancer drugs and predicts the most appropriate one for treating a particular cancer.
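
The ranking idea can be sketched with a toy model: fit one regression per drug mapping a sample's omics profile to its measured response, then rank drugs for a new sample by predicted efficacy. This is illustrative only; DRUML's actual models, features, and response measures differ.

```python
import numpy as np

rng = np.random.default_rng(2)
n_samples, n_features, n_drugs = 60, 20, 5
X = rng.normal(size=(n_samples, n_features))             # omics features
W = rng.normal(size=(n_features, n_drugs))               # hidden response map
Y = X @ W + 0.1 * rng.normal(size=(n_samples, n_drugs))  # drug responses

# One ridge-regression model per drug: solve (X'X + lam I) w = X' y.
def ridge(X, y, lam=1.0):
    A = X.T @ X + lam * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y)

models = [ridge(X, Y[:, j]) for j in range(n_drugs)]

x_new = rng.normal(size=n_features)           # a new tumour sample
scores = np.array([x_new @ w for w in models])
ranking = np.argsort(scores)  # lower score = more anti-proliferative (assumed)
print(ranking)
```

The appeal of this design is that one omics assay feeds every per-drug model at once, so the whole panel of 412 drugs can be ranked from a single measurement.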

Stochastic Hard Thresholding Algorithms for AUC Maximization Machine Learning

In this paper, we aim to develop stochastic hard thresholding algorithms for the important problem of AUC maximization in imbalanced classification. The main challenge is the pairwise loss involved in AUC maximization. We overcome this obstacle by reformulating the U-statistics objective function as an empirical risk minimization (ERM), from which a stochastic hard thresholding algorithm (SHT-AUC) is developed. To the best of our knowledge, this is the first attempt to provide stochastic hard thresholding algorithms for AUC maximization with a per-iteration cost of $O(bd)$, where $d$ and $b$ are the dimension of the data and the minibatch size, respectively. We show that the proposed algorithm enjoys a linear convergence rate up to a tolerance error. In particular, we show that, if the data are generated from a Gaussian distribution, convergence becomes slower as the data get more imbalanced. We conduct extensive experiments to show the efficiency and effectiveness of the proposed algorithms.
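
The central primitive in such methods is the hard thresholding operator: after each minibatch gradient step, keep only the k largest-magnitude coordinates of the iterate. The sketch below illustrates it on plain sparse linear regression with numpy rather than the paper's AUC/ERM objective; the minibatch step has the $O(bd)$ per-iteration flavour the abstract mentions.

```python
import numpy as np

# Hard thresholding: zero out everything except the k largest-magnitude entries.
def hard_threshold(w, k):
    out = np.zeros_like(w)
    idx = np.argsort(np.abs(w))[-k:]
    out[idx] = w[idx]
    return out

rng = np.random.default_rng(3)
d, n, k = 50, 500, 3
w_true = np.zeros(d)
w_true[[4, 17, 30]] = [2.0, -1.5, 1.0]   # a 3-sparse ground truth
X = rng.normal(size=(n, d))
y = X @ w_true                            # noiseless responses

w = np.zeros(d)
for _ in range(500):
    batch = rng.choice(n, size=128, replace=False)  # minibatch: O(b d) work
    grad = X[batch].T @ (X[batch] @ w - y[batch]) / len(batch)
    w = hard_threshold(w - 0.1 * grad, k)           # gradient step + projection

print(np.flatnonzero(w))   # indices of the recovered support
```

Keeping only k coordinates per iterate is what makes the method suitable for high-dimensional sparse problems: the iterate never densifies.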

Population structure-learned classifier for high-dimension low-sample-size class-imbalanced problem Machine Learning

Classification on high-dimension low-sample-size (HDLSS) data is a challenging problem, and class-imbalanced data are common in most application fields. We term this Imbalanced HDLSS (IHDLSS). Recent theoretical results reveal that the classification criterion and tolerance similarity are crucial to HDLSS, which emphasizes maximizing within-class variance on the premise of class separability. Based on this idea, a novel linear binary classifier, termed the Population Structure-learned Classifier (PSC), is proposed. PSC obtains better generalization performance on IHDLSS by maximizing the sum of the inter-class and intra-class scatter matrices on the premise of class separability and by assigning different intercept values to the majority and minority classes. The salient features of the proposed approach are: (1) it works well on IHDLSS; (2) the inverse of the high-dimensional matrix can be solved in a low-dimensional space; (3) it is self-adaptive in determining the intercept term for each class; (4) it has the same computational complexity as the SVM. A series of evaluations is conducted on one simulated data set and eight real-world benchmark gene-analysis data sets for IHDLSS. Experimental results demonstrate that PSC is superior to state-of-the-art methods on IHDLSS.
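
The scatter matrices the abstract refers to are standard ingredients of linear classifiers, shown below on an imbalanced toy problem. Note that PSC's actual criterion and intercept-assignment scheme differ; this numpy sketch only shows how within- and between-class scatter are computed and combined into a Fisher-style linear direction.

```python
import numpy as np

rng = np.random.default_rng(4)
X0 = rng.normal(loc=0.0, size=(40, 5))   # majority class
X1 = rng.normal(loc=2.0, size=(8, 5))    # minority class (imbalanced)
X = np.vstack([X0, X1])
mu, mu0, mu1 = X.mean(0), X0.mean(0), X1.mean(0)

# Within-class scatter: spread of each class around its own mean.
S_w = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
# Between-class scatter: spread of the class means around the global mean.
S_b = (len(X0) * np.outer(mu0 - mu, mu0 - mu)
       + len(X1) * np.outer(mu1 - mu, mu1 - mu))

# Fisher-style direction: separate class means relative to within-class spread.
w = np.linalg.solve(S_w + 1e-6 * np.eye(5), mu1 - mu0)
proj0, proj1 = X0 @ w, X1 @ w
print(proj1.mean() > proj0.mean())   # the classes separate along w
```

In the IHDLSS regime the number of features far exceeds the number of samples, which is why PSC's ability to invert the relevant matrix in a low-dimensional space (feature 2 above) matters.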

Two-step penalised logistic regression for multi-omic data with an application to cardiometabolic syndrome Machine Learning

Summary: Building classification models that predict a binary class label from high-dimensional multi-omics datasets poses several challenges, due to the typically widely differing characteristics of the data layers in terms of number of predictors, type of data, and levels of noise. Previous research has shown that applying classical logistic regression with an elastic-net penalty to these datasets can lead to poor results (Liu et al., 2018). We implement a two-step approach to multi-omic logistic regression in which variable selection is performed on each layer separately and a predictive model is then built using the variables selected in the first step. Here, our approach is compared to other methods developed for the same purpose, and we adapt existing software for multi-omic linear regression (Zhao and Zucknick, 2020) to the logistic regression setting. Extensive simulation studies show that our approach should be preferred if the goal is to select as many relevant predictors as possible while achieving prediction performance comparable to that of the best competitors. Our motivating example is a cardiometabolic syndrome dataset comprising eight 'omic data types for two extreme phenotype groups (10 obese and 10 lipodystrophy individuals) and 185 blood donors. Our proposed approach allows us to identify features that characterise cardiometabolic syndrome at the molecular level.
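
The two-step structure can be sketched in numpy: select variables within each omic layer separately (here, simple correlation screening stands in for the paper's penalised selection), then fit one logistic model on the pooled selected features. The penalties, tuning, and software of the actual method differ.

```python
import numpy as np

# Plain gradient-descent logistic regression for the second-step model.
def logistic_fit(X, y, steps=300, lr=0.5):
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1 / (1 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

# Step 1 stand-in: keep the features most correlated with the label.
def screen(layer, y, keep):
    corr = np.abs((layer - layer.mean(0)).T @ (y - y.mean())) / len(y)
    return np.argsort(corr)[-keep:]

rng = np.random.default_rng(5)
n = 120
y = rng.integers(0, 2, size=n).astype(float)
# Two layers of very different dimensionality, as in multi-omics data.
layer_a = rng.normal(size=(n, 500)); layer_a[:, 0] += 2 * y  # one signal feature
layer_b = rng.normal(size=(n, 30));  layer_b[:, 3] += 2 * y  # one signal feature

sel_a, sel_b = screen(layer_a, y, 5), screen(layer_b, y, 3)
X_sel = np.hstack([layer_a[:, sel_a], layer_b[:, sel_b]])   # step-2 design
w = logistic_fit(X_sel, y)
acc = ((1 / (1 + np.exp(-X_sel @ w)) > 0.5) == y).mean()
print(0 in sel_a, 3 in sel_b)
```

Selecting per layer prevents the 500-feature layer from crowding out the 30-feature layer, which is exactly the imbalance between data layers that motivates the two-step design.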

Prediction of Cancer Microarray and DNA Methylation Data using Non-negative Matrix Factorization Machine Learning

Over the past few years, microarray technology has spread considerably across many biological domains, particularly those pertaining to cancers such as leukemia, prostate cancer, and colon cancer. The primary bottleneck in properly understanding such datasets lies in their dimensionality, so for efficient and effective study a substantial reduction in dimension is necessary. This study suggests different algorithms and approaches for reducing the dimensionality of such microarray datasets. It exploits the matrix-like structure of microarray data and uses a popular technique called Non-Negative Matrix Factorization (NMF) to reduce dimensionality, primarily in the field of biological data. Classification accuracies are then compared for these algorithms. This technique gives an accuracy of 98%.
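
NMF itself is compact enough to sketch: factor a nonnegative samples-by-genes matrix V as W @ H with a small inner rank r, so each sample is described by r metagene loadings instead of hundreds of raw expression values. Below is a minimal multiplicative-update implementation in numpy on synthetic data; the study's full pipeline and downstream classifiers are not shown.

```python
import numpy as np

# NMF by multiplicative updates: V (n x m, nonnegative) ~ W (n x r) @ H (r x m).
def nmf(V, r, iters=500, eps=1e-9):
    rng = np.random.default_rng(6)
    n, m = V.shape
    W = rng.random((n, r))
    H = rng.random((r, m))
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update H, keeping it nonnegative
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update W, keeping it nonnegative
    return W, H

rng = np.random.default_rng(7)
# Synthetic "microarray": 30 samples, 200 genes, true rank 4.
V = rng.random((30, 4)) @ rng.random((4, 200))
W, H = nmf(V, r=4)

err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(W.shape)   # 30 samples now live in 4 dimensions
```

The rows of W (one per sample) are the low-dimensional representation that a classifier would consume in place of the original 200 gene columns.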