AITopics | progen

progen

ProGen: Revisiting Probabilistic Spatial-Temporal Time Series Forecasting from a Continuous Generative Perspective Using Stochastic Differential Equations

Gong, Mingze, Chen, Lei, Li, Jia

arXiv.org Machine Learning, Nov-2-2024

Accurate forecasting of spatiotemporal data remains challenging due to complex spatial dependencies and temporal dynamics. The inherent uncertainty and variability in such data often render deterministic models insufficient, prompting a shift towards probabilistic approaches, where diffusion-based generative models have emerged as effective solutions. In this paper, we present ProGen, a novel framework for probabilistic spatiotemporal time series forecasting that leverages Stochastic Differential Equations (SDEs) and diffusion-based generative modeling techniques in the continuous domain. By integrating a novel denoising score model, graph neural networks, and a tailored SDE, ProGen provides a robust solution that effectively captures spatiotemporal dependencies while managing uncertainty. Our extensive experiments on four benchmark traffic datasets demonstrate that ProGen outperforms state-of-the-art deterministic and probabilistic models. This work contributes a continuous, diffusion-based generative approach to spatiotemporal forecasting, paving the way for future research in probabilistic modeling and stochastic processes.

artificial intelligence, forecasting, machine learning, (14 more...)

arXiv.org Machine Learning

2411.01267

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > California (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.71)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

GOLD: Generalized Knowledge Distillation via Out-of-Distribution-Guided Language Data Generation

Gholami, Mohsen, Akbari, Mohammad, Hu, Cindy, Masrani, Vaden, Wang, Z. Jane, Zhang, Yong

arXiv.org Artificial Intelligence, Mar-28-2024

Knowledge distillation from LLMs is essential for the efficient deployment of language models. Prior works have proposed data generation using LLMs for preparing distilled models. We argue that generating data with LLMs is prone to sampling mainly from the center of the original content distribution. This limitation hinders the distilled model from learning the true underlying data distribution and causes it to forget the tails of the distribution (samples with lower probability). To this end, we propose GOLD, a task-agnostic data generation and knowledge distillation framework, which employs an iterative out-of-distribution-guided feedback mechanism for the LLM. As a result, the generated data improves the generalizability of distilled models. An energy-based OOD evaluation approach is also introduced to deal with noisy generated data. Our extensive experiments on 10 different classification and sequence-to-sequence tasks in NLP show that GOLD outperforms prior art and the LLM by an average of 5% and 14%, respectively. We also show that the proposed method is applicable to less explored and novel tasks. The code is available.

dataset, gold, llm, (17 more...)

arXiv.org Artificial Intelligence

2403.19754

Country:

Europe > France (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Asia > Middle East > Iraq (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

AI Has Successfully Imitated Human Evolution--and Might Do It Even Better

#artificialintelligence, Feb-9-2023, 11:05:30 GMT

Artificial intelligence is a master of imitation. Every time scientists design an AI, whether to mimic human language or master a game like chess, it either matches or far exceeds the capabilities of its biological creators. Now, AI has proven that it can even master the art of biology itself. Researchers at the University of California-San Francisco, the University of California-Berkeley, and Salesforce Research, a science arm of the SF-based software company, developed an AI capable of copying evolution itself. This doesn't mean the AI created some sort of evolutionarily superior superhuman (yet), but instead, the AI designed sequences of the 20 amino acids that make up proteins.

artificial intelligence, natural language, protein, (7 more...)

#artificialintelligence

Country:

North America > United States > California > San Francisco County > San Francisco (0.58)
North America > United States > California > Alameda County > Berkeley (0.26)

Industry:

Information Technology > Software (0.89)
Health & Medicine > Pharmaceuticals & Biotechnology (0.63)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.84)

AI Has Successfully demonstrated Human Evolution - BLOCKGENI

#artificialintelligence, Feb-2-2023, 12:20:33 GMT

Researchers at the University of California, San Francisco, the University of California, Berkeley, and Salesforce Research, the science division of the San Francisco-based software firm, created an AI that can mimic evolution itself. This doesn't mean the AI produced some kind of evolutionarily superior superhuman; rather, it constructed sequences of the 20 amino acids that make up proteins. Some of the sequences performed as well as those produced by millions of years of evolution. Notably, the researchers didn't create an AI from scratch but adapted a language model from a different field. The study used Salesforce's ProGen, a natural language processing model, and treated biological proteins as "sentences" written in a language of amino acids.

amino acid, protein, sequence, (6 more...)

#artificialintelligence

Country: North America > United States > California > San Francisco County > San Francisco (0.81)

Genre: Research Report (0.38)

Industry:

Information Technology > Software (0.88)
Health & Medicine > Pharmaceuticals & Biotechnology (0.85)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

This Week's Awesome Tech Stories From Around the Web (Through January 28)

#artificialintelligence, Jan-28-2023, 21:35:23 GMT

AI Has Designed Bacteria-Killing Proteins From Scratch--and They Work
Karmela Padavic-Callaghan, New Scientist
"The AI, called ProGen, works in a similar way to AIs that can generate text. ProGen learned how to generate new proteins by learning the grammar of how amino acids combine to form 280 million existing proteins. Instead of the researchers choosing a topic for the AI to write about, they could specify a group of similar proteins for it to focus on. In this case, they chose a group of proteins with antimicrobial activity."

BuzzFeed to Use ChatGPT Creator OpenAI to Help Create Quizzes and Other Content
Alexandra Bruell, The Wall Street Journal
"BuzzFeed Inc. said it would rely on ChatGPT creator OpenAI to enhance its quizzes and personalize some content for its audiences, becoming the latest digital publisher to embrace artificial intelligence. In a memo to staff sent Thursday morning, which was reviewed by The Wall Street Journal, Chief Executive Jonah Peretti said he intends for AI to play a larger role in the company's editorial and business operations this year."

large language model, machine learning, natural language, (14 more...)

#artificialintelligence

Industry:

Media (0.56)
Health & Medicine > Pharmaceuticals & Biotechnology (0.37)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.58)

AI has designed bacteria-killing proteins from scratch – and they work

New Scientist, Jan-26-2023, 16:00:51 GMT

An AI has designed anti-microbial proteins that were then tested in real life and shown to work. The same approach could eventually be used to make new medicines. Proteins are made of chains of amino acids. The sequence of those acids determines the protein's shape and function. Ali Madani at Salesforce Research in California and his colleagues used an AI to design millions of new proteins, then created a small sample of those to test whether they worked.

artificial intelligence, madani, protein, (8 more...)

New Scientist

Country: North America > United States > California > San Francisco County > San Francisco (0.18)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.66)

Technology: Information Technology > Artificial Intelligence (1.00)

ProGen: Progressive Zero-shot Dataset Generation via In-context Feedback

Ye, Jiacheng, Gao, Jiahui, Feng, Jiangtao, Wu, Zhiyong, Yu, Tao, Kong, Lingpeng

arXiv.org Artificial Intelligence, Oct-21-2022

Recently, dataset-generation-based zero-shot learning has shown promising results by training a task-specific model with a dataset synthesized from large pre-trained language models (PLMs). The final task-specific model often achieves comparable or even better performance than PLMs under the zero-shot setting, with orders of magnitude fewer parameters. However, synthetic datasets have their drawbacks. They have long suffered from low-quality issues (e.g., low informativeness and redundancy). This explains why massive synthetic data does not lead to better performance, a scenario we would expect with human-labeled data. To improve the quality of dataset synthesis, we propose a progressive zero-shot dataset generation framework, ProGen, which leverages the feedback from the task-specific model to guide the generation of new training data via in-context examples. Extensive experiments on five text classification datasets demonstrate the effectiveness of the proposed approach. We also show ProGen achieves on-par or superior performance with only 1% synthetic dataset size compared to baseline methods without in-context feedback.

large language model, natural language, progressive zero-shot dataset generation, (3 more...)

arXiv.org Artificial Intelligence

2210.12329

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

ProGen: Language Modeling for Protein Generation

Madani, Ali, McCann, Bryan, Naik, Nikhil, Keskar, Nitish Shirish, Anand, Namrata, Eguchi, Raphael R., Huang, Po-Ssu, Socher, Richard

arXiv.org Machine Learning, Mar-7-2020

Generative modeling for protein engineering is key to solving fundamental problems in synthetic biology, medicine, and material science. We pose protein engineering as an unsupervised sequence generation problem in order to leverage the exponentially growing set of proteins that lack costly, structural annotations. We train a 1.2B-parameter language model, ProGen, on ~280M protein sequences conditioned on taxonomic and keyword tags such as molecular function and cellular component. This provides ProGen with an unprecedented range of evolutionary sequence diversity and allows it to generate with fine-grained control as demonstrated by metrics based on primary sequence similarity, secondary structure accuracy, and conformational energy.

progen, protein, sequence, (10 more...)

arXiv.org Machine Learning

2004.03497

Country:

North America > United States > California > Santa Clara County > Stanford (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.66)
