cancer sample
Boffins build AI to identify genetic mutations • The Register
Machine learning techniques, such as deep learning, have proven surprisingly effective at identifying diseases like breast cancer. However, when it comes to identifying mutations at the genetic level, these models have come up short, according to researchers at the University of California San Diego (UCSD). In a paper published in the journal Nature Biotechnology this week, researchers at the university propose a new machine learning framework called DeepMosaic that uses a combination of image-based visualization and deep learning models to identify genetic mutations associated with diseases including cancer and disorders with genetic links, such as autism spectrum disorder. Using AI/ML to identify disease has been a hot topic in recent years. The problem, according to UCSD professor Joe Gleeson, is that most of these models aren't well suited to identifying the genetic mutations known as mosaic variants or mosaic mutations, because most of the software developed over the last two decades was trained on cancer samples. Because cancer cells divide so rapidly, they're relatively easy for computer programs to spot, he explained in an interview with The Register.
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Therapeutic Area > Genetic Disease (1.00)
- Health & Medicine > Therapeutic Area > Neurology > Autism (0.56)
How machine learning is helping patients diagnosed with the most common childhood cancer
New software developed by Peter Mac and collaborators is helping patients diagnosed with acute lymphoblastic leukemia (ALL) to determine what subtype they have. ALL is the most common childhood cancer in the world, and also affects adults. "Thirty to forty percent of all childhood cancers are ALL, it's a major pediatric cancer problem," says Associate Professor Paul Ekert from Peter Mac and the Children's Cancer Institute, who was involved in this work. More than 300 people are diagnosed with the disease in Australia each year, and more than half of those are young children under the age of 15. Determining what subtype of ALL a patient has provides valuable information about their prognosis, and how they should best be treated.
- Health & Medicine > Therapeutic Area > Pediatrics/Neonatology (1.00)
- Health & Medicine > Therapeutic Area > Oncology > Childhood Cancer (1.00)
A molecular generative model with genetic algorithm and tree search for cancer samples
Personalized medicine is expected to maximize the intended drug effects and minimize side effects by treating patients based on their genetic profiles. Thus, it is important to generate drugs based on the genetic profiles of diseases, especially in anticancer drug discovery. However, this is challenging because the vast chemical space and the variation in cancer properties make the search for suitable molecules very time-consuming. Therefore, an efficient and fast search method that considers genetic profiles is required for de novo molecular design of anticancer drugs. Here, we propose a faster molecular generative model with genetic algorithm and tree search for cancer samples (FasterGTS). FasterGTS is constructed with a genetic algorithm and a Monte Carlo tree search with three deep neural networks (supervised learning, self-trained, and value networks), and it generates anticancer molecules based on the genetic profiles of a cancer sample. When compared to other methods, FasterGTS generated cancer sample-specific molecules with the general chemical properties required for cancer drugs within a limited number of samplings. We expect FasterGTS to contribute to anticancer drug generation.
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Artificial Intelligence-Powered Electronic 'Nose' Can Accurately Sniff Out Cancers
A new sensor may be able to detect cancer by 'sniffing' blood samples. An odor-based test that detects vapors from human blood plasma samples was able to tell the difference between benign and cancerous cells with up to 95% accuracy, according to work presented this week at the American Society of Clinical Oncology meeting in Philadelphia. The study was led by scientists at the University of Pennsylvania and Penn Perelman School of Medicine and utilizes artificial intelligence (AI) and machine learning to analyze molecules called 'volatile organic compounds' (VOCs). These are released from cells in blood and tissues, and the 'electronic nose' contains nanosensors that are calibrated to detect VOCs. The researchers took samples from 20 patients with ovarian cancer, 20 with non-cancerous ovarian tumors and 20 people who had no tumors at all, and found that the electronic nose could tell apart the ovarian cancer samples with 95% accuracy.
Bridging the Generalization Gap: Training Robust Models on Confounded Biological Data
Liu, Tzu-Yu, Kannan, Ajay, Drake, Adam, Bertin, Marvin, Wan, Nathan
Statistical learning on biological data can be challenging due to confounding variables in sample collection and processing. Confounders can cause models to generalize poorly and result in inaccurate prediction performance metrics if models are not validated thoroughly. In this paper, we propose methods to control for confounding factors and further improve prediction performance. We introduce OrthoNormal basis construction In cOnfounding factor Normalization (ONION) to remove confounding covariates and use the Domain-Adversarial Neural Network (DANN) to penalize models for encoding confounder information. We apply the proposed methods to simulated and empirical patient data and show significant improvements in generalization.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > California > San Mateo County > South San Francisco (0.04)
- North America > Canada > Quebec > Montreal (0.04)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
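The confounder-removal idea in the abstract above (building an orthonormal basis from confounding covariates and removing it from the data) can be illustrated with a generic projection. The sketch below is not the authors' ONION implementation: it is plain Gram-Schmidt residualization, and the function names, the added intercept column, and the toy batch variable are all assumptions made for illustration.

```python
import math


def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt: return orthonormal vectors spanning the inputs."""
    basis = []
    for v in vectors:
        w = list(v)
        # Subtract the component of w along every basis vector found so far.
        for b in basis:
            dot = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - dot * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > tol:  # skip vectors that are (numerically) linearly dependent
            basis.append([wi / norm for wi in w])
    return basis


def remove_confounders(feature, confounders):
    """Project one feature (one value per sample) off the confounder span.

    Assumption: a constant (intercept) column is included alongside the
    confounders, so the residual is also mean-centered.
    """
    n = len(feature)
    basis = orthonormal_basis([[1.0] * n] + [list(c) for c in confounders])
    residual = list(feature)
    for b in basis:
        dot = sum(ri * bi for ri, bi in zip(residual, b))
        residual = [ri - dot * bi for ri, bi in zip(residual, b)]
    return residual


# Hypothetical example: a measurement shifted by a two-level batch variable.
batch = [0.0, 0.0, 1.0, 1.0]
feature = [1.0, 2.0, 6.0, 7.0]
cleaned = remove_confounders(feature, [batch])
```

After the projection, `cleaned` has zero inner product with `batch`, so a linear model can no longer recover the batch label from this feature. Nonlinear confounding is not handled by a projection like this; the abstract describes an adversarial penalty (DANN) for that case.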
The chromatin accessibility landscape of primary human cancers
The Cancer Genome Atlas (TCGA) provides a high-quality resource of molecular data on a large variety of human cancers. Corces et al. used a recently modified assay to profile chromatin accessibility to determine the accessible chromatin landscape in 410 TCGA samples from 23 cancer types (see the Perspective by Taipale). When the data were integrated with other omics data available for the same tumor samples, inherited risk loci for cancer predisposition were revealed, transcription factors and enhancers driving molecular subtypes of cancer with patient survival differences were identified, and noncoding mutations associated with clinical prognosis were discovered. Science, this issue p. eaav1898; see also p. 401 Cancer is one of the leading causes of death worldwide. Although the 2% of the human genome that encodes proteins has been extensively studied, much remains to be learned about the noncoding genome and gene regulation in cancer. Genes are turned on and off in the proper cell types and cell states by transcription factor (TF) proteins acting on DNA regulatory elements that are scattered over the vast noncoding genome and exert long-range influences. The Cancer Genome Atlas (TCGA) is a global consortium that aims to accelerate the understanding of the molecular basis of cancer. TCGA has systematically collected DNA mutation, methylation, RNA expression, and other comprehensive datasets from primary human cancer tissue. TCGA has served as an invaluable resource for the identification of genomic aberrations, altered transcriptional networks, and cancer subtypes. Nonetheless, the gene regulatory landscapes of these tumors have largely been inferred through indirect means. A hallmark of active DNA regulatory elements is chromatin accessibility. Eukaryotic genomes are compacted in chromatin, a complex of DNA and proteins, and only the active regulatory elements are accessible by the cell's machinery such as TFs. 
ATAC-seq enables the genome-wide profiling of TF binding events that orchestrate gene expression programs and give a cell its identity. We generated high-quality ATAC-seq data in 410 tumor samples from TCGA, identifying diverse regulatory landscapes across 23 cancer types. These chromatin accessibility profiles identify cancer- and tissue-specific DNA regulatory elements that enable classification of tumor subtypes with newly recognized prognostic importance. We identify distinct TF activities in cancer based on differences in the inferred patterns of TF-DNA interaction and gene expression. Genome-wide correlation of gene expression and chromatin accessibility predicts tens of thousands of putative interactions between distal regulatory elements and gene promoters, including key oncogenes and targets in cancer immunotherapy, such as MYC, SRC, BCL2, and PDL1.
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- Health & Medicine > Therapeutic Area > Oncology > Carcinoma (0.94)
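The abstract above mentions correlating chromatin accessibility with gene expression across samples to predict links between distal regulatory elements and genes. A minimal sketch of that idea follows. It assumes Pearson correlation and a fixed cutoff, neither of which is stated in the abstract, and the peak names, values, and the `link_peaks_to_gene` helper are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant signal carries no correlation information
    return cov / math.sqrt(var_x * var_y)


def link_peaks_to_gene(peak_accessibility, gene_expression, threshold=0.5):
    """Return peaks whose accessibility tracks the gene's expression.

    peak_accessibility maps a peak name to one value per sample;
    gene_expression holds the gene's value in the same sample order.
    The threshold is an illustrative assumption, not a published cutoff.
    """
    links = {}
    for name, accessibility in peak_accessibility.items():
        r = pearson(accessibility, gene_expression)
        if abs(r) >= threshold:
            links[name] = r
    return links


# Hypothetical data for four samples.
expression = [1.0, 2.0, 3.0, 4.0]
peaks = {
    "peak_a": [2.0, 4.0, 6.0, 8.0],    # rises with expression
    "peak_b": [1.0, -1.0, 1.0, -1.0],  # weakly related
}
links = link_peaks_to_gene(peaks, expression)
```

With this toy input only `peak_a` passes the cutoff. A genome-wide analysis would also need a null model and multiple-testing correction, which this sketch omits.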
Dealing with Unbalanced Classes in Machine Learning - deep ideas
In many real-world classification problems, we stumble upon training data with unbalanced classes. This means that the individual classes do not contain the same number of elements. For example, if we want to build an image-based skin cancer detection system using convolutional neural networks, we might encounter a dataset with about 95% negatives and 5% positives. This is for good reasons: Images associated with a negative diagnosis are way more common than images with a positive diagnosis. Rather than regarding this as a flaw in the dataset, we should leverage the additional information that we get.
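To make the 95%/5% example concrete, the sketch below computes the class priors of such a dataset and shows why raw accuracy is a poor yardstick for it: a classifier that always answers "negative" scores 95% accuracy while detecting no positives. The label encoding (0 for negative, 1 for positive) and the helper names are assumptions for illustration.

```python
from collections import Counter


def class_priors(labels):
    """Fraction of the dataset belonging to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: count / total for cls, count in counts.items()}


def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Share of true positives that the classifier actually found."""
    positives = [(t, p) for t, p in zip(y_true, y_pred) if t == positive]
    if not positives:
        return 0.0
    return sum(1 for t, p in positives if p == positive) / len(positives)


# 95 negative and 5 positive examples, as in the skin cancer scenario.
labels = [0] * 95 + [1] * 5
always_negative = [0] * len(labels)

priors = class_priors(labels)                    # class 0: 0.95, class 1: 0.05
baseline_acc = accuracy(labels, always_negative)
baseline_recall = recall(labels, always_negative)
```

Here `baseline_acc` is 0.95 and `baseline_recall` is 0.0, so any model for this task has to be judged against the priors rather than against a 50% coin flip.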