Collaborating Authors


Going to Extremes: Weakly Supervised Medical Image Segmentation


Medical image annotation is a major hurdle for developing precise and robust machine learning models. Annotation is expensive, time-consuming, and often requires expert knowledge, particularly in the medical field. Here, we suggest using minimal user interaction in the form of extreme point clicks to train a segmentation model which, in effect, can be used to speed up medical image annotation. An initial segmentation is generated based on the extreme points utilizing the random walker algorithm. This initial segmentation is then used as a noisy supervision signal to train a fully convolutional network that can segment the organ of interest, based on the provided user clicks.
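The extreme points mentioned above are the outermost pixels of an object along each image axis. As a rough illustration only, the sketch below derives them from a small 2D binary mask, which is one way such clicks could be simulated when a reference mask exists. The function names, the 2D setting, and the mask itself are assumptions made for this example; the random walker step and the network training are not shown.

```python
from typing import Dict, List, Tuple

Point = Tuple[int, int]  # (row, col)


def extreme_points(mask: List[List[int]]) -> Dict[str, Point]:
    """Return the top-, bottom-, left- and right-most foreground pixels.

    Hypothetical helper: simulates the four extreme-point clicks a user
    would place on a 2D object.
    """
    foreground = [
        (r, c)
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value
    ]
    if not foreground:
        raise ValueError("mask has no foreground pixels")
    return {
        "top": min(foreground, key=lambda p: p[0]),
        "bottom": max(foreground, key=lambda p: p[0]),
        "left": min(foreground, key=lambda p: p[1]),
        "right": max(foreground, key=lambda p: p[1]),
    }


def bounding_box(points: Dict[str, Point]) -> Tuple[int, int, int, int]:
    """Bounding box (min_row, min_col, max_row, max_col) implied by the clicks."""
    rows = [p[0] for p in points.values()]
    cols = [p[1] for p in points.values()]
    return min(rows), min(cols), max(rows), max(cols)


if __name__ == "__main__":
    demo_mask = [
        [0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0],
    ]
    clicks = extreme_points(demo_mask)
    print(clicks)
    print(bounding_box(clicks))
```

The four points also fix a bounding box around the object, which is a natural region in which to seed an initial segmentation.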

Streaming System Design for Large-Scale Machine Learning Applications


In October 2019, Crunchbase raised $30M in Series C financing from OMERS Ventures. Crunchbase is charging forward, focusing more deeply on the analysis of business signals for both private and public companies. Here on the Engineering Team, we have been working on the interesting challenge of detecting these high-value business signals from various sources, such as Tweets and news articles. Some examples of important signals include funding rounds, acquisitions, and key leadership hires. Finding these signals the moment they are announced empowers our customers to make well-informed business decisions.

Top 5 Sources For Analytics and Machine Learning Datasets - GreatLearning


Machine learning becomes engaging when we face various challenges, and thus finding suitable datasets relevant to the use case is essential. Flexibility refers to the number of tasks that a dataset supports. For example, Microsoft's COCO (Common Objects in Context) is used for object classification, detection, and segmentation. Add a bunch of captions for the same images, and we can use it as a dataset for an image caption generator as well. When we are just starting, we will be working with some of the small and standard machine learning datasets like CIFAR-10, MNIST, Iris, etc.

Data excellence: Better data for better AI


IEEE Intelligent Systems 24, 2 (2009). In the decade since then, the research community has done a lot with quantity, but quality has been left behind. Data quality is not only human error. Data quality should consider context of use: it is not easy to give a yes/no answer for most of our AI tasks; the answer typically depends on the context, on the task, on the usage, etc. Data quality should include real-world diversity: disagreement is a signal for diversity and should be included in AI training. Data quality is difficult even with experts. For prevention of malaria, use only in individuals traveling to malarious areas where CHLOROQUINE resistant P. falciparum MALARIA has not been reported.

Transcriptomic signatures across human tissues identify functional rare genetic variation


Every human genome contains tens of thousands of rare genetic variants (which include single nucleotide changes, insertions or deletions, and larger structural variants), and some may have a functional effect. Ferraro et al. examined data from individuals in the Genotype-Tissue Expression (GTEx) project for outliers across tissues caused by gene expression, splicing, and allele-specific expression. Single rare variants were observed that affected the expression and allele-specific expression of multiple genes and, in the case of a gene fusion event, splicing. Experimental and computational validation suggest that many individuals carry more than 50 rare variants that affect transcription in some way. Although most variants were predicted to not affect an individual's phenotype, a small percentage showed likely disease-related associations, emphasizing the importance of studying the impact of rare genetic variation on the transcriptome. Science, this issue p. eaaz5900 (doi: 10.1126/science.aaz5900)

### INTRODUCTION

The human genome contains tens of thousands of rare (minor allele frequency <1%) variants, some of which contribute to disease risk. Using 838 samples with whole-genome and multitissue transcriptome sequencing data in the Genotype-Tissue Expression (GTEx) project version 8, we assessed how rare genetic variants contribute to extreme patterns in gene expression (eOutliers), allelic expression (aseOutliers), and alternative splicing (sOutliers). We integrated these three signals across 49 tissues with genomic annotations to prioritize high-impact rare variants (RVs) that associate with human traits.

### RATIONALE

Outlier gene expression aids in identifying functional RVs. Transcriptome sequencing provides diverse measurements beyond gene expression, including allele-specific expression and alternative splicing, which can provide additional insight into RV functional effects.

### RESULTS

After identifying multitissue eOutliers, aseOutliers, and sOutliers, we found that outlier individuals of each type were significantly more likely to carry an RV near the corresponding gene. Among eOutliers, we observed strong enrichment of rare structural variants. sOutliers were particularly enriched for RVs that disrupted or created a splicing consensus sequence. aseOutliers provided the strongest enrichment signal when evaluated from just a single tissue. We developed Watershed, a probabilistic model for personal genome interpretation that improves over standard genomic annotation-based methods for scoring RVs by integrating these three transcriptomic signals from the same individual and replicates in an independent cohort. To assess whether outlier RVs identified in GTEx associate with traits, we evaluated these variants for association with diverse traits in the UK Biobank, the Million Veterans Program, and the Jackson Heart Study. We found that transcriptome-assisted prioritization identified RVs with larger trait effect sizes and were better predictors of effect size than genomic annotation alone.

### CONCLUSION

With >800 genomes matched with transcriptomes across 49 tissues, we were able to study RVs that underlie extreme changes in the transcriptome. To capture the diversity of these extreme changes, we developed and integrated approaches to identify expression, allele-specific expression, and alternative splicing outliers, and characterized the RV landscape underlying each outlier signal. We demonstrate that personal genome interpretation and RV discovery is enhanced by using these signals. This approach provides a new means to integrate a richer set of functional RVs into models of genetic burden, improve disease gene identification, and enable the delivery of precision genomics.

Figure: Transcriptomic signatures identify functional rare genetic variation. We identified genes in individuals that show outlier expression, allele-specific expression, or alternative splicing and assessed enrichment of nearby rare variation. We integrated these three outlier signals with genomic annotation data to prioritize functional RVs and to intersect those variants with disease loci to identify potential RV trait associations.

Rare genetic variants are abundant across the human genome, and identifying their function and phenotypic impact is a major challenge. Measuring aberrant gene expression has aided in identifying functional, large-effect rare variants (RVs). Here, we expanded detection of genetically driven transcriptome abnormalities by analyzing gene expression, allele-specific expression, and alternative splicing from multitissue RNA-sequencing data, and demonstrate that each signal informs unique classes of RVs. We developed Watershed, a probabilistic model that integrates multiple genomic and transcriptomic signals to predict variant function, validated these predictions in additional cohorts and through experimental assays, and used them to assess RVs in the UK Biobank, the Million Veterans Program, and the Jackson Heart Study. Our results link thousands of RVs to diverse molecular effects and provide evidence to associate RVs affecting the transcriptome with human traits.
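To make the idea of a multitissue expression outlier more concrete, the sketch below flags individuals whose expression of one gene is extreme in most tissues, using the median of per-tissue z-scores. This is one common, simplified approach and is not necessarily the procedure used by the authors; the function names, the input layout, and the default threshold of 3.0 are assumptions made for illustration.

```python
from statistics import mean, median, stdev
from typing import Dict, List


def zscores(values: List[float]) -> List[float]:
    """Standardize values; return zeros if there is no variation."""
    mu, sd = mean(values), stdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def multitissue_outliers(
    expression: Dict[str, Dict[str, float]],
    threshold: float = 3.0,  # hypothetical cutoff, not taken from the text
) -> Dict[str, float]:
    """Flag individuals whose median z-score across tissues is extreme.

    expression maps tissue -> {individual -> expression of a single gene}.
    Returns {individual: median z-score} for flagged individuals only.
    """
    per_individual: Dict[str, List[float]] = {}
    for by_individual in expression.values():
        individuals = sorted(by_individual)
        scores = zscores([by_individual[i] for i in individuals])
        for individual, z in zip(individuals, scores):
            per_individual.setdefault(individual, []).append(z)
    medians = {i: median(z) for i, z in per_individual.items()}
    return {i: m for i, m in medians.items() if abs(m) > threshold}


if __name__ == "__main__":
    # Toy data: nine individuals at baseline, one with elevated expression.
    tissue = {f"ind{k}": 10.0 for k in range(9)}
    tissue["ind9"] = 20.0
    toy = {"liver": dict(tissue), "lung": dict(tissue)}
    print(multitissue_outliers(toy, threshold=2.0))
```

Taking the median across tissues means a single noisy tissue cannot flag an individual on its own, which is the intuition behind looking for outliers that recur across tissues.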

COVID-19 CT Analysis using Deep Learning


In the following sections, I will elaborate on how we rapidly built a COVID-19 solution using deep learning tools. The ideas and methods presented here can be used for any new virus or disease with imaging features in CT, especially in the initial phase when almost no data is available. A CT scan includes a series of slices (for those who are not familiar with CT, read the short explanation below). Since we had a very limited number of COVID-19 patient scans, we decided to use 2D slices instead of the 3D volume of each scan. This allowed us to multiply our dataset and to overcome the first obstacle of a small dataset.
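The effect of working with 2D slices instead of 3D volumes can be sketched as follows. This is a minimal illustration using nested lists in place of real image arrays; the function names are hypothetical, and copying the scan-level label onto every slice is a simplifying assumption, since not every slice of a positive scan necessarily shows findings.

```python
from typing import Any, List, Tuple

Slice2D = List[List[float]]   # height x width
Volume3D = List[Slice2D]      # depth x height x width


def volume_to_slices(volume: Volume3D, label: Any) -> List[Tuple[Slice2D, Any]]:
    """Turn one labelled 3D scan into one labelled sample per 2D slice."""
    return [(slice_2d, label) for slice_2d in volume]


def build_slice_dataset(
    scans: List[Tuple[Volume3D, Any]],
) -> List[Tuple[Slice2D, Any]]:
    """Flatten a list of (volume, label) scans into a 2D-slice dataset."""
    dataset: List[Tuple[Slice2D, Any]] = []
    for volume, label in scans:
        dataset.extend(volume_to_slices(volume, label))
    return dataset


if __name__ == "__main__":
    blank = [[0.0, 0.0], [0.0, 0.0]]
    scans = [
        ([blank, blank, blank], "covid"),            # 3 slices
        ([blank, blank, blank, blank], "normal"),    # 4 slices
    ]
    samples = build_slice_dataset(scans)
    print(len(scans), "scans ->", len(samples), "training samples")
```

Two scans become seven training samples here; with real CT scans, which contain many slices each, the increase is far larger.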

Developing Deep Learning Models for Pathology Analysis


Ahead of the 6th Digital Pathology & AI Congress: USA, Dr Saeed Hassanpour introduces us to the subject of his presentation: the opportunities and challenges in developing deep learning-based tools for histology. In the last decade, there has been massive progress in the artificial intelligence (AI) field, particularly in the domain of deep learning. This progress presents new opportunities for various domains dealing with images, particularly medical imaging. At the Hassanpour lab, we are harnessing advances in AI to enable pathologists to analyze and understand their data. These applications are particularly relevant to histology images.

Machine Learning Training Data Annotation Types for AI in News & Media


AI in media is making this industry operate with more automated tasks for better efficiency in the market. Using computer vision or NLP/NLU, AI in news media makes object and language recognition possible for machines. Cogito provides the training data sets for AI in media and news to develop visual perception-based AI models or language-based machine learning models. The media industry can make good use of face recognition systems to detect the various faces captured in images or videos while reporting on or covering important topics around the world. The landmark annotation technique is used to detect or recognize such faces through AI.

Artificial Intelligence (AI) Business Directory – Adaptive Toolbox


AI Business Directory is a list of key companies (including startups and big corporations) worldwide with products, services, and applications in fields related to Artificial Intelligence (AI). A registered user can submit a listing and maintain it for their own business. The listing service is free. Typical AI fields include, but are not limited to: Machine Learning (ML), Deep Learning, Cognitive Computing, Natural Language Processing (NLP), Computer Vision, Pattern Recognition, Autonomous Agents and Multi-Agent Systems, Automated Planning and Scheduling, Robotics, Predictive Analytics, etc. Typical AI applications include, but are not limited to: Smart Agriculture, Healthcare, Manufacturing, Smart Cities, Smart Grids, Smart Mobility, Smart Lighting, Smart Buildings, Smart Home, Autonomous Vehicles, Supply Chain and Logistics, Cybersecurity, etc.

Using Facial Landmarks for Overlaying Faces with Masks


Have you ever wondered how Instagram masks fit so perfectly on your face? Would you like to know how you can try to implement something similar by yourself? This post will help you with that! To remind you how important it is to wear a medical mask in the current COVID-19 pandemic, we will write a demo script that overlays your face captured from a camera with a virtual medical mask using facial landmarks. You will not only learn how this can be done with the help of computer vision, but can also try out different masks yourself.
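As a rough idea of the geometry involved, the sketch below computes where a mask image could be placed given two facial landmarks. It assumes a landmark detector has already returned pixel coordinates for the left and right edges of the jaw; the function name, the choice of landmarks, and the returned fields are hypothetical and are not taken from the post's own script.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]  # (x, y) in image pixels


def mask_placement(
    left_jaw: Point, right_jaw: Point, mask_width_px: float
) -> Dict[str, object]:
    """Estimate scale, rotation and centre for overlaying a mask image.

    left_jaw / right_jaw: hypothetical landmark positions on the face.
    mask_width_px: width of the mask image before scaling.
    """
    dx = right_jaw[0] - left_jaw[0]
    dy = right_jaw[1] - left_jaw[1]
    face_width = math.hypot(dx, dy)
    return {
        # Resize factor so the mask spans the distance between the landmarks.
        "scale": face_width / mask_width_px,
        # Head tilt, measured in image coordinates (y grows downward).
        "angle_deg": math.degrees(math.atan2(dy, dx)),
        # Midpoint between the two landmarks.
        "center": (
            (left_jaw[0] + right_jaw[0]) / 2,
            (left_jaw[1] + right_jaw[1]) / 2,
        ),
    }


if __name__ == "__main__":
    print(mask_placement((100.0, 200.0), (300.0, 200.0), mask_width_px=400.0))
```

Because the values are recomputed from the landmarks on every frame, the mask follows the face as it moves closer to the camera or tilts.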