AITopics | Generative AI

Generative AI

Review for NeurIPS paper: CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models

Neural Information Processing Systems | Jan-23-2025, 01:46:14 GMT

Weaknesses: The benchmarking of the molecular VAE model does not include a null model, so its performance cannot be assessed against random sampling of chemical space. The results show that generating high-affinity ligands is more challenging for NSP9, but the authors provide no reasoning or discussion as to why this may be. Could this be an artifact of the size and affinity range of the available training data? In the section on novelty, the authors mention using Tanimoto similarity between molecular fingerprints but do not specify the algorithm and parameters used for fingerprint generation. Previous studies have demonstrated that the calculated similarities between molecules can vary significantly between fingerprinting methods.
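
The point about fingerprint choice can be illustrated with a small sketch. For two bit-vector fingerprints A and B, the Tanimoto coefficient is |A ∩ B| / |A ∪ B|; the code below computes it over sets of on-bit indices. The bit sets and the two "schemes" are made-up toy values, not the output of any real fingerprinting algorithm and not taken from the CogMol paper.

```python
from typing import AbstractSet


def tanimoto(a: AbstractSet[int], b: AbstractSet[int]) -> float:
    """Tanimoto coefficient |A & B| / |A | B| over sets of on-bit indices."""
    shared = len(a & b)
    union = len(a) + len(b) - shared
    # Two empty fingerprints: return 0.0 (a convention chosen for this sketch).
    return shared / union if union else 0.0


# Hypothetical on-bit indices for the same two molecules under two
# different (made-up) fingerprinting schemes.
scheme_1 = ({1, 4, 7, 9}, {1, 4, 8})
scheme_2 = ({2, 3, 5}, {2, 3, 5, 11})

for name, (mol_a, mol_b) in (("scheme_1", scheme_1), ("scheme_2", scheme_2)):
    print(name, round(tanimoto(mol_a, mol_b), 3))
```

In this toy setup the same pair scores 0.4 under one scheme and 0.75 under the other, which is why a novelty threshold is hard to interpret unless the fingerprint type and its parameters are reported.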

deep generative model, neurips paper, target-specific and selective drug design, (9 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.53)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.40)
Health & Medicine > Therapeutic Area > Immunology (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.40)

Review for NeurIPS paper: CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models

Neural Information Processing Systems | Jan-23-2025, 01:26:01 GMT

This paper proposes a framework, called CogMol, for designing drug-like small molecules for specific targets, and applies it to designing molecules that bind to three proteins found in SARS-CoV-2. Reviewers raised various concerns and questions, and the author response largely resolved the major criticisms. Overall, based on the technical novelty, experiments, and clarity of writing, this paper passes the bar for acceptance to NeurIPS as a technical paper. However, multiple reviewers expressed concern that readers may over-interpret the results in the context of the current pandemic, because wet-lab validation experiments have not been performed (these would be out of scope and are not necessary for an ML conference paper). Thus, it is strongly recommended that the authors revise the manuscript to state explicitly that no experimental validation has been performed and that only in-silico binding conclusions can be drawn.

deep generative model, neurips paper, target-specific and selective drug design, (4 more...)

Neural Information Processing Systems

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.40)

Reasoning Language Models: A Blueprint

Besta, Maciej, Barth, Julia, Schreiber, Eric, Kubicek, Ales, Catarino, Afonso, Gerstenberger, Robert, Nyczyk, Piotr, Iff, Patrick, Li, Yueling, Houliston, Sam, Sternal, Tomasz, Copik, Marcin, Kwaśniewski, Grzegorz, Müller, Jürgen, Flis, Łukasz, Eberhard, Hannes, Niewiadomski, Hubert, Hoefler, Torsten

arXiv.org Artificial Intelligence | Jan-23-2025

Reasoning language models (RLMs), also known as Large Reasoning Models (LRMs), such as OpenAI's o1 and o3, DeepSeek-V3, and Alibaba's QwQ, have redefined AI's problem-solving capabilities by extending LLMs with advanced reasoning mechanisms. Yet, their high costs, proprietary nature, and complex architectures - uniquely combining Reinforcement Learning (RL), search heuristics, and LLMs - present accessibility and scalability challenges. To address these, we propose a comprehensive blueprint that organizes RLM components into a modular framework, based on a survey and analysis of all RLM works. This blueprint incorporates diverse reasoning structures (chains, trees, graphs, and nested forms), reasoning strategies (e.g., Monte Carlo Tree Search, Beam Search), RL concepts (policy, value models and others), supervision schemes (Outcome-Based and Process-Based Supervision), and other related concepts (e.g., Test-Time Compute, Retrieval-Augmented Generation, agent tools). We also provide detailed mathematical formulations and algorithmic specifications to simplify RLM implementation. By showing how schemes like LLaMA-Berry, QwQ, Journey Learning, and Graph of Thoughts fit as special cases, we demonstrate the blueprint's versatility and unifying potential. To illustrate its utility, we introduce x1, a modular implementation for rapid RLM prototyping and experimentation. Using x1 and a literature review, we provide key insights, such as multi-phase training for policy and value models, and the importance of familiar training distributions. Finally, we discuss scalable RLM cloud deployments and we outline how RLMs can integrate with a broader LLM ecosystem. Our work demystifies RLM construction, democratizes advanced reasoning capabilities, and fosters innovation, aiming to mitigate the gap between "rich AI" and "poor AI" by lowering barriers to RLM design and experimentation.
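
As a rough illustration of one of the reasoning strategies named in the abstract, the following is a generic beam search over partial reasoning chains. It is a minimal sketch, not the paper's x1 implementation: the `expand` and `score` callables, the tuple-of-integers state, and the toy objective are all assumptions made for this example. In an RLM, `expand` would typically be a policy model proposing next steps and `score` a value model.

```python
from typing import Callable, List, Tuple

State = Tuple[int, ...]


def beam_search(
    start: State,
    expand: Callable[[State], List[State]],
    score: Callable[[State], float],
    beam_width: int,
    depth: int,
) -> State:
    """Keep the `beam_width` best partial chains at every step."""
    beam = [start]
    for _ in range(depth):
        # Propose successors for every chain currently in the beam.
        candidates = [child for state in beam for child in expand(state)]
        if not candidates:
            break
        # Retain only the highest-scoring candidates.
        candidates.sort(key=score, reverse=True)
        beam = candidates[:beam_width]
    return max(beam, key=score)


# Toy problem (hypothetical): build a three-step chain of digits summing to 4.
def toy_expand(state: State) -> List[State]:
    return [state + (d,) for d in (0, 1, 2)]


def toy_score(state: State) -> float:
    return -abs(sum(state) - 4)


best = beam_search((), toy_expand, toy_score, beam_width=2, depth=3)
print(best, sum(best))
```

A wider beam explores more alternatives per step at proportionally higher scoring cost, which is the basic trade-off behind test-time compute.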

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2501.11223

Country:

Europe (1.00)
Asia > Middle East (0.67)
North America > United States > Minnesota (0.27)

Genre:

Workflow (1.00)
Research Report (1.00)
Overview (1.00)

Industry:

Information Technology (1.00)
Education (1.00)
Leisure & Entertainment > Games (0.67)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

The Role of Generative AI in Software Student CollaborAItion

Kiesler, Natalie, Smith, Jacqueline, Leinonen, Juho, Fox, Armando, MacNeil, Stephen, Ihantola, Petri

arXiv.org Artificial Intelligence | Jan-23-2025

Collaboration is a crucial part of computing education. The increase in AI capabilities over the last couple of years is bound to profoundly affect all aspects of systems and software engineering, including collaboration. In this position paper, we consider a scenario where AI agents would be able to take on any role in collaborative processes in computing education. We outline these roles, the activities and group dynamics that software development currently includes, and discuss if and in what way AI could facilitate these roles and activities. The goal of our work is to envision and critically examine potential futures. We present scenarios suggesting how AI ...

Khan [28] has proposed an inspiring vision of how AI could help realize personalized individual tutors for every learner. Complementing this, an expert panel from 2020 [49] draws a scenario where "AI supports orchestration of the multiple types of activities, learning partners, and interaction patterns that can enrich a classroom". We believe the possibilities are even broader, and to help think about them, we propose a thought experiment that not only accommodates emerging practices and visions but also suggests new use cases in education that (to the best of our knowledge) have not yet been explored.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2501.14084

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Germany > Bavaria > Middle Franconia > Nuremberg (0.14)
(11 more...)

Genre: Instructional Material (1.00)

Industry:

Information Technology (0.68)
Education > Educational Setting (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.94)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.42)

TFG-Flow: Training-free Guidance in Multimodal Generative Flow

Lin, Haowei, Li, Shanda, Ye, Haotian, Yang, Yiming, Ermon, Stefano, Liang, Yitao, Ma, Jianzhu

arXiv.org Artificial Intelligence | Jan-23-2025

Given an unconditional generative model and a predictor for a target property (e.g., a classifier), the goal of training-free guidance is to generate samples with desirable target properties without additional training. As a highly efficient technique for steering generative models toward flexible outcomes, training-free guidance has gained increasing attention in diffusion models. However, existing methods only handle data in continuous spaces, while many scientific applications involve both continuous and discrete data (referred to as multimodality). Another emerging trend is the growing use of the simple and general flow matching framework in building generative foundation models, where guided generation remains under-explored. To address this, we introduce TFG-Flow, a novel training-free guidance method for multimodal generative flow. TFG-Flow addresses the curse-of-dimensionality while maintaining the property of unbiased sampling in guiding discrete variables. We validate TFG-Flow on four molecular design tasks and show that TFG-Flow has great potential in drug design by generating molecules with desired properties.

Recent advancements in generative foundation models have demonstrated their increasing power across a wide range of domains (Reid et al., 2024; Achiam et al., 2023; Abramson et al., 2024). In particular, diffusion-based foundation models, such as Stable Diffusion (Esser et al., 2024) and SORA (Brooks et al., 2024), have achieved significant success, catalyzing a new wave of applications in areas such as art and science. As these models become more prevalent, a critical question arises: how can we steer these foundation models to achieve specific properties during inference time?

One promising direction is using classifier-based guidance (Dhariwal & Nichol, 2021) or classifier-free guidance (Ho & Salimans, 2022), which typically necessitate training a specialized model for each conditioning signal (e.g., a noise-conditional classifier or a text-conditional denoiser). This resource-intensive and time-consuming process greatly limits their applicability. Recently, there has been growing interest in training-free guidance for diffusion models, which allows users to steer the generation process using an off-the-shelf differentiable target predictor without requiring additional model training (Ye et al., 2024). A target predictor can be any classifier, loss, or energy function used to score the quality of the generated samples. Training-free guidance offers a flexible and efficient means of customizing generation, holding the potential to transform the field of generative AI.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2501.14216

Country:

Asia > China (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)

Utilizing Evolution Strategies to Train Transformers in Reinforcement Learning

Lorenc, Matyáš

arXiv.org Artificial Intelligence | Jan-23-2025

We explore the capability of evolution strategies to train an agent whose policy is based on a transformer architecture in a reinforcement learning setting. We performed experiments using OpenAI's highly parallelizable evolution strategy to train a Decision Transformer in the Humanoid locomotion environment and in Atari game environments, testing the ability of this black-box optimization technique to train even such relatively large and complicated models (compared to those previously tested in the literature). We also proposed a method to aid the training by first pretraining the model before using OpenAI-ES to train it further, and tested its effectiveness. The examined evolution strategy proved, in general, capable of achieving strong results and obtained high-performing agents. Therefore, the pretraining was shown to be unnecessary; still, it helped us observe and formulate several further insights.
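
For readers unfamiliar with the black-box optimizer mentioned above, here is a minimal sketch of an OpenAI-ES-style update: sample Gaussian perturbations of the parameters, evaluate each perturbed copy, and move the parameters along the reward-weighted average of the perturbations. It is not the author's code. The toy quadratic reward, the hyperparameter values, and the z-score reward normalization (used here in place of rank-based fitness shaping) are assumptions made for illustration, and the sketch runs serially rather than in parallel.

```python
import random
from typing import Callable, List


def openai_es_sketch(
    reward: Callable[[List[float]], float],
    theta: List[float],
    sigma: float = 0.1,      # perturbation scale (assumed value)
    alpha: float = 0.02,     # learning rate (assumed value)
    population: int = 50,
    iterations: int = 200,
    seed: int = 0,
) -> List[float]:
    rng = random.Random(seed)
    theta = list(theta)
    dim = len(theta)
    for _ in range(iterations):
        # One Gaussian noise vector per population member.
        noises = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                  for _ in range(population)]
        rewards = [reward([t + sigma * e for t, e in zip(theta, eps)])
                   for eps in noises]
        # Normalize rewards so the step size does not depend on reward scale.
        mean = sum(rewards) / population
        std = (sum((r - mean) ** 2 for r in rewards) / population) ** 0.5
        if std == 0.0:
            continue
        advantages = [(r - mean) / std for r in rewards]
        # Gradient estimate: (1 / (n * sigma)) * sum_i advantage_i * eps_i.
        for j in range(dim):
            grad_j = sum(a * eps[j] for a, eps in zip(advantages, noises))
            theta[j] += alpha * grad_j / (population * sigma)
    return theta


# Toy objective (hypothetical): reward peaks when theta equals TARGET.
TARGET = [1.0, -2.0, 0.5]


def toy_reward(x: List[float]) -> float:
    return -sum((xi - ti) ** 2 for xi, ti in zip(x, TARGET))


start = [0.0, 0.0, 0.0]
final = openai_es_sketch(toy_reward, start)
print(toy_reward(start), toy_reward(final))
```

Because each population member only needs the shared parameters and its own noise vector, the evaluations are independent of one another, which is what makes the method easy to parallelize.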

evolutionary algorithm, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2501.13883

Country:

North America > United States > New York > New York County > New York City (0.14)
Europe > Czechia > Prague (0.04)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
Africa > Togo (0.04)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Games > Computer Games (0.87)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)

Parameter-Efficient Fine-Tuning for Foundation Models

Zhang, Dan, Feng, Tao, Xue, Lilong, Wang, Yuandong, Dong, Yuxiao, Tang, Jie

arXiv.org Artificial Intelligence | Jan-23-2025

This survey delves into the realm of Parameter-Efficient Fine-Tuning (PEFT) within the context of Foundation Models (FMs). PEFT, a cost-effective fine-tuning technique, minimizes parameters and computational complexity while striving for optimal downstream task performance. FMs, like ChatGPT, DALL-E, and LLaVA specialize in language understanding, generative tasks, and multimodal tasks, trained on diverse datasets spanning text, images, and videos. The diversity of FMs guides various adaptation strategies for PEFT. Therefore, this survey aims to provide a comprehensive overview of PEFT techniques applied to diverse FMs and address critical gaps in understanding the techniques, trends, and applications. We start by providing a detailed development of FMs and PEFT. Subsequently, we systematically review the key categories and core mechanisms of PEFT across diverse FMs to offer a comprehensive understanding of trends. We also explore the most recent applications across various FMs to demonstrate the versatility of PEFT, shedding light on the integration of systematic PEFT methods with a range of FMs. Furthermore, we identify potential research and development directions for improving PEFTs in the future. This survey provides a valuable resource for both newcomers and experts seeking to understand and use the power of PEFT across FMs. All reviewed papers are listed at https://github.com/THUDM/Awesome-Parameter-Efficient-Fine-Tuning-for-Foundation-Models.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2501.13787

Country: Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.93)

Industry:

Health & Medicine (0.67)
Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

Musk clashes with OpenAI's Altman over $500bn Stargate

Al Jazeera | Jan-22-2025, 23:15:50 GMT

Elon Musk is clashing with OpenAI CEO Sam Altman over the Stargate artificial intelligence (AI) infrastructure project touted by President Donald Trump, the latest in a feud between the two tech billionaires that started on OpenAI's board and is now testing Musk's influence with the new president. Trump on Tuesday had talked up a joint venture investing up to $500bn through a new partnership formed by OpenAI, the maker of ChatGPT, alongside Oracle and SoftBank. The new entity, Stargate, is already starting to build out data centres and the electricity generation needed for the further development of fast-evolving AI technology. Trump declared it "a resounding declaration of confidence in America's potential" under his new administration, with an initial private investment of $100bn that could reach five times that sum. But Musk, a close Trump adviser who helped bankroll his campaign and now leads a government cost-cutting initiative, questioned the value of the investment hours later.

altman, musk, openai, (13 more...)

Al Jazeera

Country:

North America > United States > Texas > Taylor County > Abilene (0.05)
North America > United States > Tennessee > Shelby County > Memphis (0.05)
North America > United States > California (0.05)
Europe > Switzerland (0.05)

Industry:

Energy (1.00)
Information Technology > Services (0.56)
Government > Regional Government > North America Government > United States Government (0.56)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Stargate Isn't a Victory for Trump

The Atlantic - Technology | Jan-22-2025, 23:15:04 GMT

Late yesterday afternoon, the president of the United States transformed, very briefly, into the comms guy for a new tech company. At a press conference capping his first full day back in the White House, Donald Trump stood beside three of the most influential executives in the world--Sam Altman of OpenAI, Larry Ellison of Oracle, and Masayoshi Son of SoftBank--and announced the Stargate Project, "the largest AI infrastructure project, by far, in history." Although Trump's rhetoric may seem to suggest otherwise, Stargate is not a new federal program but rather a private venture uniting these three companies with other leaders in the AI race, such as Microsoft and Nvidia. The new company--for which Son will serve as chairman and OpenAI will be in charge of operations--will spend a planned $500 billion over the next four years to build data centers, power plants, and other such digital infrastructure in the United States, all in hopes of developing ever more advanced AI models. Trump presented Stargate as a victory for his "America First" agenda, saying that it may "lead to something that could be the biggest of all"--an apparent reference to superintelligent machines.

large language model, machine learning, natural language, (23 more...)

The Atlantic - Technology

Country: North America > United States (1.00)

Industry:

Information Technology (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.59)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.48)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.48)

Reviews: Policy Continuation with Hindsight Inverse Dynamics

Neural Information Processing Systems | Jan-22-2025, 22:41:17 GMT

The paper presents a new approach to inverse dynamics learning, which is extended to goal-conditioned, multi-step inverse dynamics. The approach is combined with standard RL algorithms to solve multi-goal tasks such as the OpenAI Fetch environment. All reviewers liked the ideas presented in the paper and appreciated the contributions. The experiments were also well executed and the results are convincing. I am also convinced that the paper offers interesting contributions to the field of multi-goal RL and recommend this paper for a spotlight presentation.

hindsight inverse dynamic, policy continuation

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.35)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.35)
