AITopics

2506.01376

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Joeres, Roman, Bojar, Daniel

Higher-Order Message Passing for Glycan Representation Learning

arXiv.org Artificial Intelligence Oct-22-2024

Glycans are the most complex biological sequence, with monosaccharides forming extended, non-linear sequences. As post-translational modifications, they modulate protein structure, function, and interactions. Due to their diversity and complexity, predictive models of glycan properties and functions are still insufficient. Graph Neural Networks (GNNs) are deep learning models designed to process and analyze graph-structured data. These architectures leverage the connectivity and relational information in graphs to learn effective representations of nodes, edges, and entire graphs. Iteratively aggregating information from neighboring nodes, GNNs capture complex patterns within graph data, making them particularly well-suited for tasks such as link prediction or graph classification across domains. This work presents a new model architecture based on combinatorial complexes and higher-order message passing to extract features from glycan structures into a latent space representation. The architecture is evaluated on an improved GlycanML benchmark suite, establishing a new state-of-the-art performance. We envision that these improvements will spur further advances in computational glycosciences and reveal the roles of glycans in biology.
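The neighbor aggregation described in the abstract can be illustrated with a minimal sketch. It shows a single round of plain mean-aggregation message passing on a toy branched glycan. It is not the combinatorial-complex, higher-order architecture of the paper; the node names, feature vectors, and the function `message_passing_round` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def message_passing_round(
    features: Dict[str, List[float]],
    neighbors: Dict[str, List[str]],
) -> Dict[str, List[float]]:
    """One round of mean aggregation: every node averages its own
    feature vector with the feature vectors of its direct neighbors."""
    updated = {}
    for node, own in features.items():
        stacked = [own] + [features[n] for n in neighbors.get(node, [])]
        updated[node] = [
            sum(vec[i] for vec in stacked) / len(stacked)
            for i in range(len(own))
        ]
    return updated


# Toy branched glycan (hypothetical labels): one core monosaccharide
# carrying two branches, which a purely linear sequence cannot express.
neighbors = {
    "Man_core": ["Man_a", "Man_b"],
    "Man_a": ["Man_core"],
    "Man_b": ["Man_core"],
}
features = {
    "Man_core": [1.0, 0.0],
    "Man_a": [0.0, 1.0],
    "Man_b": [0.0, 1.0],
}

updated = message_passing_round(features, neighbors)
```

After one round, the core node carries a mixture of its own features and those of both branches. Repeating the round spreads information further through the graph, which is the iterative aggregation the abstract refers to.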

artificial intelligence, data mining, machine learning

2409.13467

Country:

Europe > Germany > Saarland > Saarbrücken (0.14)
Europe > Sweden > Vaestra Goetaland > Gothenburg (0.05)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > Greece (0.04)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.66)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence May-25-2024

GlycanML: A Multi-Task and Multi-Structure Benchmark for Glycan Machine Learning

Xu, Minghao, Geng, Yunteng, Zhang, Yihang, Yang, Ling, Tang, Jian, Zhang, Wentao

Glycans are basic biomolecules and perform essential functions within living organisms. The rapid increase of functional glycan data provides a good opportunity for machine learning solutions to glycan understanding. However, a standard machine learning benchmark for glycan function prediction is still lacking. In this work, we fill this gap by building a comprehensive benchmark for Glycan Machine Learning (GlycanML). The GlycanML benchmark consists of diverse types of tasks, including glycan taxonomy prediction, glycan immunogenicity prediction, glycosylation type prediction, and protein-glycan interaction prediction. Glycans can be represented by both sequences and graphs in GlycanML, which enables us to extensively evaluate sequence-based models and graph neural networks (GNNs) on benchmark tasks. Furthermore, by concurrently performing eight glycan taxonomy prediction tasks, we introduce the GlycanML-MTL testbed for multi-task learning (MTL) algorithms. Experimental results show the superiority of modeling glycans with multi-relational GNNs, and suitable MTL methods can further boost model performance. We provide all datasets and source code at https://github.com/GlycanML/GlycanML and maintain a leaderboard at https://GlycanML.github.io/project.

glycan, learning, prediction

2405.16206

Country:

North America > Canada > Quebec > Montreal (0.04)
Europe > Portugal > Lisbon > Lisbon (0.04)
Europe > Greece (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.94)
Education (0.68)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

#artificialintelligence Sep-13-2021, 00:30:18 GMT

Artificial intelligence could revolutionise glycomics datasets

Researchers have created a tool that allows glycomics datasets to be analysed using artificial intelligence for early cancer diagnoses. A team at the University of California (UC) San Diego, US, has developed a tool called GlyCompare that enables researchers to analyse glycomics datasets using artificial intelligence (AI), potentially leading to early cancer diagnoses. GlyCompare takes a systems-level perspective that accounts for shared biosynthetic pathways of glycans within and across samples. According to the team, one of the keys to the GlyCompare approach is that it looks at the biological steps needed to synthesise the subunits that make up glycans, rather than looking only at the whole glycans themselves, thereby improving the accuracy of statistical analyses of glycomics data. To introduce their technology, the team demonstrated that it enhances comparisons of glycomics datasets by focusing on the hidden relationships between glycans in several contexts, including gastric cancer tissues.
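The idea of analysing subunits rather than only whole glycans can be sketched as follows. This is a simplified illustration, not the GlyCompare implementation: it assumes each glycan's abundance is credited to every substructure it contains, and the glycan names, substructure names, and the function `substructure_abundance` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


def substructure_abundance(
    glycan_abundance: Dict[str, float],
    glycan_substructures: Dict[str, Set[str]],
) -> Dict[str, float]:
    """Sum the measured abundance of every glycan into each of the
    substructures that glycan contains (assumed aggregation rule)."""
    totals = defaultdict(float)
    for glycan, abundance in glycan_abundance.items():
        for sub in glycan_substructures.get(glycan, set()):
            totals[sub] += abundance
    return dict(totals)


# Hypothetical sample: two glycans that share a common core.
glycan_abundance = {"G1": 0.6, "G2": 0.4}
glycan_substructures = {
    "G1": {"core", "core+Gal"},  # G1 extends the core with a galactose
    "G2": {"core"},
}

totals = substructure_abundance(glycan_abundance, glycan_substructures)
```

In this toy sample the shared core accumulates the abundance of both glycans, so two samples that differ in whole-glycan profiles can still be compared on the substructures they have in common.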

glycomic dataset, glycompare, intelligence

Country: North America > United States > California > San Diego County > San Diego (0.32)

Genre: Research Report (0.37)

Industry: Health & Medicine > Therapeutic Area > Oncology (1.00)

Technology: Information Technology > Artificial Intelligence (1.00)

#artificialintelligence Jul-9-2021, 15:50:35 GMT

Graph Convolutional Neural Networks to Analyze Complex Carbohydrates

Graph convolutional neural networks (GCNs) have attracted increasing amounts of attention over the last couple of years, with more and more disciplines finding use for them. This has also been extended into the life sciences, as GCNs have been used to analyze proteins, drugs, and of course biological networks. One key advantage of GCNs that has enabled this expansion is their ability to natively work with nonlinear data formats, in contrast to more linear data structures such as in natural languages. Because of this feature, we also implemented GCNs for our own topic of interest, the study of complex carbohydrates or glycans. Glycans are ubiquitous in biology, decorating every cell and playing key roles in processes such as viral infection or tumor immune evasion.

glycan, influenza virus, virus

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

#artificialintelligence Jun-20-2021, 01:55:11 GMT

New AI model helps understand virus spread from animals to humans

A new model that applies artificial intelligence to carbohydrates improves the understanding of the infection process and could help predict which viruses are likely to spread from animals to humans. This is reported in a recent study led by researchers at the University of Gothenburg. Carbohydrates participate in nearly all biological processes - yet they are still not well understood. Referred to as glycans, these carbohydrates are crucial to making our body work the way it is supposed to.

daniel bojar, glycan, virus

Country: Europe > Sweden > Vaestra Goetaland > Gothenburg (0.29)

Genre: Research Report (0.92)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.85)

Technology: Information Technology > Artificial Intelligence (1.00)

#artificialintelligence Jun-17-2021, 06:30:12 GMT

Spotlight on AI: Latest Developments in the Field of Artificial Intelligence

Artificial intelligence is changing the course of our lives with its constant developments. Before the pandemic and now in the new normal, AI remains a key trend in the tech industry. It is reaching wider audiences as the years pass, and scientists, engineers, and entrepreneurs who involve themselves with modern technologies are reaping the benefits of AI and its branches, IoT and machine learning. Organizations that overlooked digital transformation and the power of artificial intelligence are picking up the pace of AI adoption. When COVID-19 was creating chaos across industries, it became evident that disruptive technologies and the automation that comes with them are more than crucial.

artificial intelligence, exscientia, latest development

Country: Europe > Sweden > Vaestra Goetaland > Gothenburg (0.07)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.52)
Health & Medicine > Therapeutic Area > Immunology (0.52)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.38)

Mohapatra, Somesh, An, Joyce, Gómez-Bombarelli, Rafael

Chemistry-informed Macromolecule Graph Representation for Similarity Computation and Supervised Learning

arXiv.org Machine Learning Mar-3-2021

Macromolecules are large, complex molecules composed of covalently bonded monomer units, existing in different stereochemical configurations and topologies. As a result of such chemical diversity, representing, comparing, and learning over macromolecules emerge as critical challenges. To address this, we developed a macromolecule graph representation, with monomers and bonds as nodes and edges, respectively. We captured the inherent chemistry of the macromolecule by using molecular fingerprints for node and edge attributes. For the first time, we demonstrated computation of chemical similarity between 2 macromolecules of varying chemistry and topology, using exact graph edit distances and graph kernels. We also trained graph neural networks for a variety of glycan classification tasks, achieving state-of-the-art results. Our work has two-fold implications - it provides a general framework for representation, comparison, and learning of macromolecules; and enables quantitative chemistry-informed decision-making and iterative design in the macromolecular chemical space. Macromolecules are ubiquitous and indispensable, from constituting what we are made up of to being present in almost everything we use. As biological macromolecules, they form the basis of life, serving as drivers of survival and growth functions. As synthetic macromolecules, humans have engineered the composition and topology to design structural components, sensors, shape-memory materials, drugs, encode messages, and much more (Lutz et al., 2016; Romio et al., 2020; Boydston et al., 2020; Thompson & Korley, 2020).
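The graph representation described in the abstract (monomers as nodes, bonds as edges) and the notion of a graph-kernel similarity can be sketched as follows. This is a deliberately simplified stand-in: it uses plain string labels where the paper uses molecular fingerprints, and a cosine-normalised label-histogram kernel rather than the exact graph edit distances or graph kernels evaluated by the authors. All names and example graphs are hypothetical.

```python
from collections import Counter
from math import sqrt
from typing import Dict, List, Tuple

# A graph is (node labels by id, list of (node id, node id, bond label)).
Graph = Tuple[Dict[int, str], List[Tuple[int, int, str]]]


def label_histogram(graph: Graph) -> Counter:
    """Count node labels and undirected (monomer, bond, monomer) triples."""
    nodes, edges = graph
    hist = Counter(("node", label) for label in nodes.values())
    for u, v, bond in edges:
        a, b = sorted((nodes[u], nodes[v]))  # make the edge direction-free
        hist[("edge", a, bond, b)] += 1
    return hist


def kernel_similarity(g1: Graph, g2: Graph) -> float:
    """Cosine similarity between the two label histograms (0.0 to 1.0)."""
    h1, h2 = label_histogram(g1), label_histogram(g2)
    dot = sum(count * h2[key] for key, count in h1.items())
    norm = sqrt(sum(c * c for c in h1.values())) * sqrt(
        sum(c * c for c in h2.values())
    )
    return dot / norm if norm else 0.0


# Hypothetical two-monomer macromolecules.
lactose_like: Graph = ({0: "Gal", 1: "Glc"}, [(0, 1, "b1-4")])
mannobiose_like: Graph = ({0: "Man", 1: "Man"}, [(0, 1, "a1-2")])

same = kernel_similarity(lactose_like, lactose_like)
different = kernel_similarity(lactose_like, mannobiose_like)
```

Identical graphs score 1 (up to floating-point rounding) and graphs with no shared monomer or bond labels score 0. A histogram kernel ignores topology beyond single bonds, which is one reason richer kernels or edit distances are used for macromolecules of varying topology.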

glycan, model architecture, roc-auc curve

2103.02565

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
North America > United States > California > Los Angeles County > Pasadena (0.04)
Europe > France (0.04)

Genre: Research Report (0.40)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)

arXiv.org Artificial Intelligence Dec-28-2015

Mining Massive Hierarchical Data Using a Scalable Probabilistic Graphical Model

AlJadda, Khalifeh, Korayem, Mohammed, Ortiz, Camilo, Grainger, Trey, Miller, John A., Rasheed, Khaled, Kochut, Krys J., York, William S., Ranzinger, Rene, Porterfield, Melody

Probabilistic Graphical Models (PGMs) are very useful in the fields of machine learning and data mining. The crucial limitation of those models, however, is scalability. The Bayesian Network, which is one of the most common PGMs used in machine learning and data mining, demonstrates this limitation when the training data consists of random variables, each of which has a large set of possible values. In the big data era, one would expect new extensions to the existing PGMs to handle the massive amount of data produced these days by computers, sensors and other electronic devices. With hierarchical data - data that is arranged in a treelike structure with several levels - one would expect to see hundreds of thousands or millions of values distributed over even just a small number of levels. When modeling this kind of hierarchical data across large data sets, Bayesian Networks become infeasible for representing the probability distributions. In this paper we introduce an extension to Bayesian Networks to handle massive sets of hierarchical data in a reasonable amount of time and space. The proposed model achieves perfect precision of 1.0 and high recall of 0.93 when it is used as a multi-label classifier for the annotation of mass spectrometry data. On another data set of 1.5 billion search logs provided by CareerBuilder.com, the model was able to predict latent semantic relationships between search keywords with accuracy up to 0.80.

data mining, machine learning, pgmhd

1512.08525

Country: North America > United States > Georgia > Clarke County > Athens (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

AlJadda, Khalifeh, Korayem, Mohammed, Ortiz, Camilo, Grainger, Trey, Miller, John A., York, William S.

PGMHD: A Scalable Probabilistic Graphical Model for Massive Hierarchical Data Problems

arXiv.org Artificial Intelligence Aug-19-2014

In the big data era, scalability has become a crucial requirement for any useful computational model. Probabilistic graphical models are very useful for mining and discovering data insights, but they are not scalable enough to be suitable for big data problems. Bayesian Networks particularly demonstrate this limitation when their data is represented using few random variables while each random variable has a massive set of values. With hierarchical data - data that is arranged in a treelike structure with several levels - one would expect to see hundreds of thousands or millions of values distributed over even just a small number of levels. When modeling this kind of hierarchical data across large data sets, Bayesian networks become infeasible for representing the probability distributions for the following reasons: i) Each level represents a single random variable with hundreds of thousands of values, ii) The number of levels is usually small, so there are also few random variables, and iii) The structure of the network is predefined since the dependency is modeled top-down from each parent to each of its child nodes, so the network would contain a single linear path for the random variables from each parent to each child node. In this paper we present a scalable probabilistic graphical model to overcome these limitations for massive hierarchical data. We believe the proposed model will lead to an easily-scalable, more readable, and expressive implementation for problems that require probabilistic-based solutions for massive amounts of hierarchical data. We successfully applied this model to solve two different challenging probabilistic-based problems on massive hierarchical data sets for different domains, namely, bioinformatics and latent semantic discovery over search logs.

bayesian network, data mining, machine learning

1407.5656

Country: North America > United States > Georgia > Clarke County > Athens (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)