 Generative AI


Evolution of Collective AI Beyond Individual Optimization

arXiv.org Artificial Intelligence

Artificial Intelligence (AI) has witnessed significant advances with the emergence of powerful neural network (NN) models. Examples include large language models [1] and image generation models such as DALL-E [2], Imagen [3], and Parti [4]. Each has achieved previously unseen capabilities as a powerful individual model through recent technical breakthroughs. In contrast, biological evolution tends to favor collective intelligence over individual ability, especially for species living in populations [5]. Unlike individual intelligence, which deals with challenges independently, collective intelligence requires the ability to process information, operate in a decentralized manner, and adaptively integrate information based on context. This distinction is evident in social insects, such as ants and bees, where collective behavior with role differentiation emerges not from highly complex individuals but through simple interactions among members.


AI in Education: Rationale, Principles, and Instructional Implications

arXiv.org Artificial Intelligence

This study examines the integration of generative AI in schools, assessing its benefits and risks. As AI use by students grows, it's crucial to understand its impact on learning and teaching practices. Generative AI, like ChatGPT, can create human-like content, prompting questions about its educational role. The article differentiates large language models from traditional search engines and stresses the need for students to develop critical source evaluation skills. Although empirical evidence on AI's classroom effects is limited, AI offers personalized learning support and problem-solving tools, alongside challenges like undermining deep learning if misused. The study emphasizes deliberate strategies to ensure AI complements, not replaces, genuine cognitive effort. AI's educational role should be context-dependent, guided by pedagogical goals. The study concludes with practical advice for teachers on effectively utilizing AI to promote understanding and critical engagement, advocating for a balanced approach to enhance students' knowledge and skills development.


Diffusion models learn distributions generated by complex Langevin dynamics

arXiv.org Artificial Intelligence

The probability distribution effectively sampled by a complex Langevin process for theories with a sign problem is not known a priori and notoriously hard to understand. Diffusion models, a class of generative AI, can learn distributions from data. In this contribution, we explore the ability of diffusion models to learn the distributions created by a complex Langevin process.


Future of Information Retrieval Research in the Age of Generative AI

arXiv.org Artificial Intelligence

In the fast-evolving field of information retrieval (IR), the integration of generative AI technologies such as large language models (LLMs) is transforming how users search for and interact with information. Recognizing this paradigm shift at the intersection of IR and generative AI (IR-GenAI), a visioning workshop supported by the Computing Community Consortium (CCC) was held in July 2024 to discuss the future of IR in the age of generative AI. This workshop convened 44 experts in information retrieval, natural language processing, human-computer interaction, and artificial intelligence from academia, industry, and government to explore how generative AI can enhance IR and vice versa, and to identify the major challenges and opportunities in this rapidly advancing field. This report summarizes the discussions as potentially important research topics and lists recommendations for academics, industry practitioners, institutions, evaluation campaigns, and funding agencies.


Second FRCSyn-onGoing: Winning Solutions and Post-Challenge Analysis to Improve Face Recognition with Synthetic Data

arXiv.org Artificial Intelligence

Synthetic data is increasingly popular for face recognition technologies, mainly due to the privacy concerns and challenges associated with obtaining real data, including diverse scenarios, quality, and demographic groups, among others. It also offers some advantages over real data, such as the large amount of data that can be generated or the ability to customize it to adapt to specific problem-solving needs. To effectively use such data, face recognition models should also be specifically designed to exploit synthetic data to its fullest potential. To promote the proposal of novel Generative AI methods and synthetic data, and to investigate the application of synthetic data to better train face recognition systems, we introduce the 2nd FRCSyn-onGoing challenge, based on the 2nd Face Recognition Challenge in the Era of Synthetic Data (FRCSyn), originally launched at CVPR 2024. This is an ongoing challenge that provides researchers with an accessible platform to benchmark i) the proposal of novel Generative AI methods and synthetic data, and ii) novel face recognition systems that are specifically proposed to take advantage of synthetic data. We focus on exploring the use of synthetic data both individually and in combination with real data to solve current challenges in face recognition such as demographic bias, domain adaptation, and performance constraints in demanding situations, such as age disparities between training and testing, changes in the pose, or occlusions. This second edition yields several findings, including a direct comparison with the first edition, in which synthetic databases were restricted to DCFace and GANDiffFace.


An overview of diffusion models for generative artificial intelligence

arXiv.org Artificial Intelligence

This article provides a mathematically rigorous introduction to denoising diffusion probabilistic models (DDPMs), sometimes also referred to as diffusion probabilistic models or diffusion models, for generative artificial intelligence. We provide a detailed basic mathematical framework for DDPMs and explain the main ideas behind training and generation procedures. In this overview article we also review selected extensions and improvements of the basic framework from the literature such as improved DDPMs, denoising diffusion implicit models, classifier-free diffusion guidance models, and latent diffusion models.
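
As a rough illustration of the forward (noising) half of a DDPM, the sketch below samples x_t directly from x_0 using the standard closed form x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps, where abar_t is the running product of (1 - beta_s). The linear schedule endpoints (1e-4 to 0.02) and all function names are assumptions chosen for illustration; they are not taken from the article, which may use different notation or schedules.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances beta_1..beta_T (assumed schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Return abar_t = prod_{s<=t} (1 - beta_s) for every step t."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_noise(x0, t, alpha_bars, rng):
    """Sample x_t ~ q(x_t | x_0) in one shot; return (x_t, eps).

    eps is the Gaussian noise that a denoising network would be
    trained to predict from x_t and t.
    """
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return x_t, eps


if __name__ == "__main__":
    alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
    rng = random.Random(0)
    for t in (0, 499, 999):
        x_t, _ = forward_noise([1.0, -2.0, 0.5], t, alpha_bars, rng)
        print(t, round(alpha_bars[t], 5), [round(v, 3) for v in x_t])
```

Under this assumed schedule abar_t shrinks toward zero as t grows, so late samples are close to pure Gaussian noise; the reverse (generation) procedure is not shown here.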


Probabilistic Analysis of Copyright Disputes and Generative AI Safety

arXiv.org Artificial Intelligence

This paper presents a probabilistic approach to analyzing copyright infringement disputes by formalizing relevant judicial principles within a coherent framework based on the random-worlds method. It provides a structured analysis of key evidentiary principles, with a particular focus on the "inverse ratio rule", a controversial doctrine adopted by some courts. Although this rule has faced significant criticism, a formal proof demonstrates its validity, provided it is properly defined. Additionally, the paper examines the heightened copyright risks posed by generative AI, highlighting how extensive access to copyrighted material by generative models increases the risk of infringement. Utilizing the probabilistic approach, the Near Access-Free (NAF) condition, previously proposed as a potential mitigation strategy, is evaluated. The analysis reveals that while the NAF condition mitigates some infringement risks, its justifiability and efficacy are questionable in certain contexts. These findings demonstrate how a rigorous probabilistic approach can advance our understanding of copyright jurisprudence and its interaction with emerging technologies.


Elon Musk asks court to stop OpenAI from becoming a for-profit

Engadget

Elon Musk's attorneys filed for an injunction against OpenAI and Microsoft on Friday accusing the two of anticompetitive practices and seeking to stop OpenAI's conversion to a for-profit company. The filing, spotted by TechCrunch, also names OpenAI CEO Sam Altman, OpenAI President Greg Brockman, Microsoft's Dee Templeton and LinkedIn co-founder Reid Hoffman as defendants. Musk first sued OpenAI earlier this year for allegedly violating its founding mission of building AI "for the benefit of humanity," but withdrew the lawsuit a few months later. He then filed another lawsuit against OpenAI in a California federal court in August, and recently added Microsoft as a defendant. The new motion accuses OpenAI and Microsoft of telling investors not to fund OpenAI's competitors, such as Musk's xAI, of "benefitting from wrongfully obtained competitively sensitive information or coordination" through its relationship with Microsoft, and other alleged antitrust violations.


Free and Customizable Code Documentation with LLMs: A Fine-Tuning Approach

arXiv.org Artificial Intelligence

Automated documentation of programming source code is a challenging task with significant practical and scientific implications for the developer community. We present a large language model (LLM)-based application that developers can use as a support tool to generate basic documentation for any publicly available repository. Over the last decade, several papers have been written on generating documentation for source code using neural network architectures. With the recent advancements in LLM technology, some open-source applications have been developed to address this problem. However, these applications typically rely on the OpenAI APIs, which incur substantial financial costs, particularly for large repositories. Moreover, none of these open-source applications offer a fine-tuned model or features that enable users to fine-tune one. Additionally, finding suitable data for fine-tuning is often challenging. Our application, available at https://pypi.org/project/readme-ready/, addresses these issues.


Generative Model for Synthesizing Ionizable Lipids: A Monte Carlo Tree Search Approach

arXiv.org Artificial Intelligence

Ionizable lipids are essential in developing lipid nanoparticles (LNPs) for effective messenger RNA (mRNA) delivery. While traditional methods for designing new ionizable lipids are typically time-consuming, deep generative models have emerged as a powerful solution, significantly accelerating the molecular discovery process. However, a practical challenge arises as the molecular structures generated can often be difficult or infeasible to synthesize. This project explores Monte Carlo tree search (MCTS)-based generative models for synthesizable ionizable lipids. Leveraging a synthetically accessible lipid building block dataset and two specialized predictors to guide the search through chemical space, we introduce a policy network guided MCTS generative model capable of producing new ionizable lipids with available synthesis pathways.
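
To make the idea of a policy-guided tree search concrete, here is a minimal sketch of the selection and backup steps using the generic PUCT rule popularized by AlphaZero-style search: score = Q + c * P * sqrt(N_parent) / (1 + N_child). This is an assumption for illustration only; the abstract does not state which selection rule, reward, or action set the authors use, and the `Node` class, the action labels, and the constant `c_puct` are all hypothetical.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    prior: float                 # policy-network probability for reaching this node
    visits: int = 0
    value_sum: float = 0.0
    children: dict = field(default_factory=dict)  # action label -> Node

    def mean_value(self):
        """Average backed-up reward (0.0 for an unvisited node)."""
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(parent, child, c_puct=1.5):
    """Exploitation term plus a prior-weighted exploration bonus."""
    exploration = c_puct * child.prior * math.sqrt(parent.visits) / (1 + child.visits)
    return child.mean_value() + exploration


def select_child(parent, c_puct=1.5):
    """Return the (action, child) pair with the highest PUCT score."""
    return max(parent.children.items(),
               key=lambda item: puct_score(parent, item[1], c_puct))


def backpropagate(path, reward):
    """Add one visit and the rollout reward to every node on the path."""
    for node in path:
        node.visits += 1
        node.value_sum += reward


if __name__ == "__main__":
    root = Node(prior=1.0, visits=4)
    # Hypothetical building-block choices with made-up priors.
    root.children["block_A"] = Node(prior=0.9)
    root.children["block_B"] = Node(prior=0.1, visits=3, value_sum=1.5)
    action, child = select_child(root)
    backpropagate([root, child], reward=1.0)
    print(action, child.visits, root.visits)
```

In a setting like the one described, the reward passed to `backpropagate` would presumably come from the property predictors, and restricting children to known building blocks is what would keep generated molecules synthesizable; both points are readings of the abstract rather than details it confirms.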