AITopics | Niagara County

Collaborating Authors

Niagara County

From thermodynamics to protein design: Diffusion models for biomolecule generation towards autonomous protein engineering

Li, Wen-ran, Cadet, Xavier F., Medina-Ortiz, David, Davari, Mehdi D., Sowdhamini, Ramanathan, Damour, Cedric, Li, Yu, Miranville, Alain, Cadet, Frederic

arXiv.org Artificial IntelligenceJan-5-2025

Protein design with desirable properties has been a significant challenge for many decades. Generative artificial intelligence is a promising approach and has achieved great success in various protein generation tasks. Notably, diffusion models stand out for their robust mathematical foundations and impressive generative capabilities, offering unique advantages in certain applications such as protein design. In this review, we first give the definition and characteristics of diffusion models and then focus on two strategies: Denoising Diffusion Probabilistic Models and Score-based Generative Models, where DDPM is the discrete form of SGM. Furthermore, we discuss their applications in protein design, peptide generation, drug discovery, and protein-ligand interaction. Finally, we outline the future perspectives of diffusion models to advance autonomous protein design and engineering. The E(3) group consists of all rotations, reflections, and translations in three-dimensions. The equivariance on the E(3) group can keep the physical stability of the frame of each amino acid as much as possible, and we reflect on how to keep the diffusion model E(3) equivariant for protein generation.

artificial intelligence, diffusion model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2501.0268

Country:

Europe > France > Île-de-France > Paris > Paris (0.14)
Asia > India > Karnataka > Bengaluru (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(7 more...)

Genre: Research Report > Promising Solution (0.34)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Education > Health & Safety > School Nutrition (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)

Add feedback

SMI-Editor: Edit-based SMILES Language Model with Fragment-level Supervision

Zheng, Kangjie, Liang, Siyue, Yang, Junwei, Feng, Bin, Liu, Zequn, Ju, Wei, Xiao, Zhiping, Zhang, Ming

arXiv.org Artificial IntelligenceDec-7-2024

SMILES, a crucial textual representation of molecular structures, has garnered significant attention as a foundation for pre-trained language models (LMs). However, most existing pre-trained SMILES LMs focus solely on the single-token level supervision during pre-training, failing to fully leverage the substructural information of molecules. This limitation makes the pre-training task overly simplistic, preventing the models from capturing richer molecular semantic information. Moreover, during pre-training, these SMILES LMs only process corrupted SMILES inputs, never encountering any valid SMILES, which leads to a train-inference mismatch. To address these challenges, we propose SMI-Editor, a novel edit-based pre-trained SMILES LM. SMI-Editor disrupts substructures within a molecule at random and feeds the resulting SMILES back into the model, which then attempts to restore the original SMILES through an editing process. This approach not only introduces fragment-level training signals, but also enables the use of valid SMILES as inputs, allowing the model to learn how to reconstruct complete molecules from these incomplete structures. As a result, the model demonstrates improved scalability and an enhanced ability to capture fragment-level molecular information. Experimental results show that SMI-Editor achieves state-of-the-art performance across multiple downstream molecular tasks, and even outperforming several 3D molecular representation models.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2412.05569

Country:

North America > United States > New York > Niagara County > Niagara Falls (0.04)
Asia > Middle East > Jordan (0.04)
Asia > China > Sichuan Province > Chengdu (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area (0.93)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.89)

Add feedback

Self-Compositional Data Augmentation for Scientific Keyphrase Generation

Houbre, Mael, Boudin, Florian, Daille, Beatrice, Aizawa, Akiko

arXiv.org Artificial IntelligenceNov-6-2024

State-of-the-art models for keyphrase generation require large amounts of training data to achieve good performance. However, obtaining keyphrase-labeled documents can be challenging and costly. To address this issue, we present a self-compositional data augmentation method. More specifically, we measure the relatedness of training documents based on their shared keyphrases, and combine similar documents to generate synthetic samples. The advantage of our method lies in its ability to create additional training samples that keep domain coherence, without relying on external data or resources. Our results on multiple datasets spanning three different domains, demonstrate that our method consistently improves keyphrase generation. A qualitative analysis of the generated keyphrases for the Computer Science domain confirms this improvement towards their representativity property.

computational linguistic, keyphrase, proceedings, (13 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3677389.3702504

2411.03039

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > Washington > King County > Seattle (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
(26 more...)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Artificial Intelligence of Things: A Survey

Siam, Shakhrul Iman, Ahn, Hyunho, Liu, Li, Alam, Samiul, Shen, Hui, Cao, Zhichao, Shroff, Ness, Krishnamachari, Bhaskar, Srivastava, Mani, Zhang, Mi

arXiv.org Artificial IntelligenceOct-25-2024

The proliferation of the Internet of Things (IoT) such as smartphones, wearables, drones, and smart speakers, as well as the gigantic amount of data they capture, have revolutionized the way we work, live, and interact with the world. Equipped with sensing, computing, networking, and communication capabilities, these devices are able to collect, analyze and transmit a wide range of data including images, videos, audio, texts, wireless signals, physiological signals from individuals and the physical world. In recent years, advancements in Artificial Intelligence (AI), particularly in deep learning (DL)/deep neural network (DNN), foundation models, and Generative AI, have propelled the integration of AI with IoT, making the concept of Artificial Intelligence of Things (AIoT) a reality. The synergy between IoT and modern AI enhances decision making, improves human-machine interactions, and facilitates more efficient operations, making AIoT one of the most exciting and promising areas that have the potential to fundamentally transform how people perceive and interact with the world. As illustrated in Figure 1, at its core, AIoT is grounded on three key components: sensing, computing, and networking & communication.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3690639

2410.19998

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.27)
North America > United States > New York > New York County > New York City (0.14)
North America > United States > Massachusetts > Suffolk County > Boston (0.14)
(23 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Transportation (1.00)
Telecommunications (1.00)
Information Technology > Security & Privacy (1.00)
(13 more...)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Communications > Mobile (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

RGDA-DDI: Residual graph attention network and dual-attention based framework for drug-drug interaction prediction

Zhou, Changjian, Zhang, Xin, Li, Jiafeng, Song, Jia, Xiang, Wensheng

arXiv.org Artificial IntelligenceAug-27-2024

Recent studies suggest that drug-drug interaction (DDI) prediction via computational approaches has significant importance for understanding the functions and co-prescriptions of multiple drugs. However, the existing silico DDI prediction methods either ignore the potential interactions among drug-drug pairs (DDPs), or fail to explicitly model and fuse the multi-scale drug feature representations for better prediction. In this study, we propose RGDA-DDI, a residual graph attention network (residual-GAT) and dual-attention based framework for drug-drug interaction prediction. A residual-GAT module is introduced to simultaneously learn multi-scale feature representations from drugs and DDPs. In addition, a dual-attention based feature fusion block is constructed to learn local joint interaction representations. A series of evaluation metrics demonstrate that the RGDA-DDI significantly improved DDI prediction performance on two public benchmark datasets, which provides a new insight into drug development.

prediction, representation, rgda-ddi, (13 more...)

arXiv.org Artificial Intelligence

2408.1531

Country:

Asia > China > Heilongjiang Province > Harbin (0.05)
North America > United States > Utah (0.04)
North America > United States > New York > Niagara County > Niagara Falls (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Data Science (0.94)

Add feedback

A Review of Large Language Models and Autonomous Agents in Chemistry

Ramos, Mayk Caldas, Collison, Christopher J., White, Andrew D.

arXiv.org Artificial IntelligenceJun-26-2024

Large language models (LLMs) are emerging as a powerful tool in chemistry across multiple domains. In chemistry, LLMs are able to accurately predict properties, design new molecules, optimize synthesis pathways, and accelerate drug and material discovery. A core emerging idea is combining LLMs with chemistry-specific tools like synthesis planners and databases, leading to so-called "agents." This review covers LLMs' recent history, current capabilities, design, challenges specific to chemistry, and future directions. Particular attention is given to agents and their emergence as a cross-chemistry paradigm. Agents have proven effective in diverse domains of chemistry, but challenges remain. It is unclear if creating domain-specific versus generalist agents and developing autonomous pipelines versus "co-pilot" systems will accelerate chemistry. An emerging direction is the development of multi-agent systems using a human-in-the-loop approach. Due to the incredibly fast development of this field, a repository has been built to keep track of the latest studies: https://github.com/ur-whitelab/LLMs-in-science.

chemical information and modeling, decoder-only architecture, synthesis prediction, (17 more...)

arXiv.org Artificial Intelligence

2407.01603

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(14 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Instructional Material (1.00)
Research Report > Promising Solution (0.92)

Industry:

Materials > Chemicals (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Analysis of Atom-level pretraining with Quantum Mechanics (QM) data for Graph Neural Networks Molecular property models

Arjona-Medina, Jose, Nugmanov, Ramil

arXiv.org Artificial IntelligenceMay-27-2024

Despite the rapid and significant advancements in deep learning for Quantitative Structure-Activity Relationship (QSAR) models, the challenge of learning robust molecular representations that effectively generalize in real-world scenarios to novel compounds remains an elusive and unresolved task. This study examines how atom-level pretraining with quantum mechanics (QM) data can mitigate violations of assumptions regarding the distributional similarity between training and test data and therefore improve performance and generalization in downstream tasks. In the public dataset Therapeutics Data Commons (TDC), we show how pretraining on atom-level QM improves performance overall and makes the activation of the features distributes more Gaussian-like which results in a representation that is more robust to distribution shifts. To the best of our knowledge, this is the first time that hidden state molecular representations are analyzed to compare the effects of molecule-level and atom-level pretraining on QM data.

dataset, different training approach, graphormer network, (15 more...)

arXiv.org Artificial Intelligence

2405.14837

Country:

North America > United States > New York > Niagara County > Niagara Falls (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.34)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.69)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Professional Agents -- Evolving Large Language Models into Autonomous Experts with Human-Level Competencies

Chu, Zhixuan, Wang, Yan, Zhu, Feng, Yu, Lu, Li, Longfei, Gu, Jinjie

arXiv.org Artificial IntelligenceFeb-5-2024

The advent of large language models (LLMs) such as ChatGPT, PaLM, and GPT-4 has catalyzed remarkable advances in natural language processing, demonstrating human-like language fluency and reasoning capacities. This position paper introduces the concept of Professional Agents (PAgents), an application framework harnessing LLM capabilities to create autonomous agents with controllable, specialized, interactive, and professional-level competencies. We posit that PAgents can reshape professional services through continuously developed expertise. Our proposed PAgents framework entails a tri-layered architecture for genesis, evolution, and synergy: a base tool layer, a middle agent layer, and a top synergy layer. This paper aims to spur discourse on promising real-world applications of LLMs. We argue the increasing sophistication and integration of PAgents could lead to AI systems exhibiting professional mastery over complex domains, serving critical needs, and potentially achieving artificial general intelligence.

agent, arxiv preprint arxiv, language model, (14 more...)

arXiv.org Artificial Intelligence

2402.03628

Country:

Asia > Singapore (0.04)
North America > Canada > Ontario > Toronto (0.04)
Africa > Rwanda > Kigali > Kigali (0.04)
(3 more...)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Enhancing Wind Speed and Wind Power Forecasting Using Shape-Wise Feature Engineering: A Novel Approach for Improved Accuracy and Robustness

Christian, Mulomba Mukendi, Kim, Yun Seon, Choi, Hyebong, Lee, Jaeyoung, You, SongHee

arXiv.org Artificial IntelligenceJan-16-2024

Accurate prediction of wind speed and power is vital for enhancing the efficiency of wind energy systems. Numerous solutions have been implemented to date, demonstrating their potential to improve forecasting. Among these, deep learning is perceived as a revolutionary approach in the field. However, despite their effectiveness, the noise present in the collected data remains a significant challenge. This noise has the potential to diminish the performance of these algorithms, leading to inaccurate predictions. In response to this, this study explores a novel feature engineering approach. This approach involves altering the data input shape in both Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) and Autoregressive models for various forecasting horizons. The results reveal substantial enhancements in model resilience against noise resulting from step increases in data. The approach could achieve an impressive 83% accuracy in predicting unseen data up to the 24th steps. Furthermore, this method consistently provides high accuracy for short, mid, and long-term forecasts, outperforming the performance of individual models. These findings pave the way for further research on noise reduction strategies at different forecasting horizons through shape-wise feature engineering.

forecasting, pred ob, prediction, (9 more...)

arXiv.org Artificial Intelligence

doi: 10.17703/IJACT.2023.11.4.393

2401.08233

Country:

Asia > South Korea (0.14)
Asia > India (0.05)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
(2 more...)

Genre:

Research Report > Promising Solution (0.71)
Overview > Innovation (0.71)

Industry: Energy > Renewable > Wind (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Arabic Handwritten Text Line Dataset

Bouchal, Hakim, Belaid, Ahror

arXiv.org Artificial IntelligenceDec-10-2023

Segmentation of Arabic manuscripts into lines of text and words is an important step to make recognition systems more efficient and accurate. The problem of segmentation into text lines is solved since there are carefully annotated dataset dedicated to this task. However, To the best of our knowledge, there are no dataset annotating the word position of Arabic texts. In this paper, we present a new dataset specifically designed for historical Arabic script in which we annotate position in word level.

dataset, text line, university, (14 more...)

arXiv.org Artificial Intelligence

2312.07573

Country:

Africa > Middle East > Algeria > Béjaïa Province > Béjaïa (0.06)
North America > United States > New York > Niagara County > Niagara Falls (0.05)
North America > United States > Massachusetts > Suffolk County > Boston (0.05)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.05)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Vision > Handwriting Recognition (0.52)

Add feedback