AITopics

2411.14405

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.66)

Generative AI for Brane Configurations, Tropical Coamoeba and 4d N=1 Quiver Gauge Theories

Seong, Rak-Kyeong

We introduce a generative AI model to obtain Type IIB brane configurations that realize toric phases of a family of 4d N=1 supersymmetric gauge theories. These 4d N=1 quiver gauge theories are worldvolume theories of a D3-brane probing a toric Calabi-Yau 3-fold. The Type IIB brane configurations that realize this family of 4d N=1 theories are known as brane tilings and are given by the tropical coamoeba projection of the mirror curve associated with the toric Calabi-Yau 3-fold. The shape of the mirror curve and its coamoeba projection, as well as the corresponding Type IIB brane configuration and the toric phase of the 4d N=1 theory, all depend on the complex structure moduli parameterizing the mirror curve. We train a generative AI model, a conditional variational autoencoder (CVAE), that takes a choice of complex structure moduli as input and generates the corresponding tropical coamoeba. This enables us not only to obtain a high-resolution representation of the entire phase space for a family of brane tilings corresponding to the same toric Calabi-Yau 3-fold, but also to continuously track the movements of the mirror curve and individual branes in the corresponding Type IIB brane configurations during phase transitions associated with Seiberg duality.

artificial intelligence, machine learning, (14 more...)

2411.16033

Country:

Asia > South Korea > Ulsan > Ulsan (0.04)
North America > United States > New York > Suffolk County > Stony Brook (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry: Education (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.90)

AVID: Adapting Video Diffusion Models to World Models

Rigter, Marc, Gupta, Tarun, Hilmkil, Agrin, Ma, Chao

Large-scale generative models have achieved remarkable success in a number of domains. However, for sequential decision-making problems, such as robotics, action-labelled data is often scarce and therefore scaling up foundation models for decision-making remains a challenge. A potential solution lies in leveraging widely-available unlabelled videos to train world models that simulate the consequences of actions. If the world model is accurate, it can be used to optimize decision-making in downstream tasks. Image-to-video diffusion models are already capable of generating highly realistic synthetic videos. However, these models are not action-conditioned, and the most powerful models are closed-source, which means they cannot be finetuned. In this work, we propose to adapt pretrained video diffusion models to action-conditioned world models, without access to the parameters of the pretrained model. Our approach, AVID, trains an adapter on a small domain-specific dataset of action-labelled videos. AVID uses a learned mask to modify the intermediate outputs of the pretrained model and generate accurate action-conditioned videos. We evaluate AVID on video game and real-world robotics data, and show that it outperforms existing baselines for diffusion model adaptation. Our results demonstrate that, if utilized correctly, pretrained video models have the potential to be powerful tools for embodied AI. Large generative models trained on web-scale data have driven rapid improvement in natural language processing (Brown, 2020; Touvron et al., 2023; Achiam et al., 2023), image generation (Rombach et al., 2022), and video generation (OpenAI, 2024).
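The abstract above says that a learned mask modifies the intermediate outputs of the pretrained model. One simple way to picture this is an element-wise blend between the pretrained output and an adapter output. The sketch below is only an illustration under that assumption: the function name `blend_with_mask`, the convex-combination form, and the choice of which side the mask weights are hypothetical and are not taken from the AVID paper.

```python
def blend_with_mask(pretrained_out, adapter_out, mask):
    """Blend two outputs element-wise with a mask in [0, 1].

    Assumed form (not confirmed by the abstract):
        blended = mask * adapter_out + (1 - mask) * pretrained_out
    A mask value of 0 keeps the pretrained output, 1 uses the adapter output.
    """
    if not (len(pretrained_out) == len(adapter_out) == len(mask)):
        raise ValueError("all inputs must have the same length")
    return [
        m * a + (1.0 - m) * p
        for p, a, m in zip(pretrained_out, adapter_out, mask)
    ]


# Usage: the mask decides, per element, how much the adapter overrides
# the frozen pretrained model.
pretrained = [1.0, 2.0, 3.0]
adapter = [10.0, 20.0, 30.0]
mask = [0.0, 1.0, 0.5]
blended = blend_with_mask(pretrained, adapter, mask)
```

In a real model the mask would itself be predicted by a trained network and applied to tensors rather than Python lists.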

diffusion model, machine learning, natural language, (17 more...)

2410.12822

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre:

Research Report > New Finding (0.86)
Research Report > Promising Solution (0.66)

Industry: Leisure & Entertainment (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
(2 more...)

Ferraro, Antonino, Galli, Antonio, La Gatta, Valerio, Postiglione, Marco, Orlando, Gian Marco, Russo, Diego, Riccio, Giuseppe, Romano, Antonio, Moscato, Vincenzo

Agent-Based Modelling Meets Generative AI in Social Network Simulations

Agent-Based Modelling (ABM) has emerged as an essential tool for simulating social networks, encompassing diverse phenomena such as information dissemination, influence dynamics, and community formation. However, manually configuring varied agent interactions and information flow dynamics poses challenges, often resulting in oversimplified models that lack real-world generalizability. Integrating modern Large Language Models (LLMs) with ABM presents a promising avenue to address these challenges and enhance simulation fidelity, leveraging LLMs' human-like capabilities in sensing, reasoning, and behavior. In this paper, we propose a novel framework utilizing LLM-empowered agents to simulate social network users based on their interests and personality traits. The framework allows for customizable agent interactions resembling various social network platforms, including mechanisms for content resharing and personalized recommendations. We validate our framework using a comprehensive Twitter dataset from the 2020 US election, demonstrating that LLM-agents accurately replicate real users' behaviors, including linguistic patterns and political inclinations. These agents form homogeneous ideological clusters and retain the main themes of their community. Notably, preference-based recommendations significantly influence agent behavior, promoting increased engagement, network homophily and the formation of echo chambers. Overall, our findings underscore the potential of LLM-agents in advancing social media simulations and unraveling intricate online dynamics.

large language model, machine learning, natural language, (19 more...)

2411.16031

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Italy > Campania > Naples (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Services (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Media > News (0.93)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.65)

Visual Counter Turing Test (VCT^2): Discovering the Challenges for AI-Generated Image Detection and Introducing Visual AI Index (V_AI)

Imanpour, Nasrin, Bajpai, Shashwat, Ghosh, Subhankar, Sankepally, Sainath Reddy, Borah, Abhilekh, Abdullah, Hasnat Md, Kosaraju, Nishoak, Dixit, Shreyas, Aziz, Ashhar, Biswas, Shwetangshu, Jain, Vinija, Chadha, Aman, Sheth, Amit, Das, Amitava

The proliferation of AI techniques for image generation, coupled with their increasing accessibility, has raised significant concerns about the potential misuse of these images to spread misinformation. Recent AI-generated image detection (AGID) methods include CNNDetection, NPR, DM Image Detection, Fake Image Detection, DIRE, LASTED, GAN Image Detection, AIDE, SSP, DRCT, RINE, OCC-CLIP, De-Fake, and Deep Fake Detection. However, we argue that the current state-of-the-art AGID techniques are inadequate for effectively detecting contemporary AI-generated images and advocate for a comprehensive reevaluation of these methods. We introduce the Visual Counter Turing Test (VCT^2), a benchmark comprising ~130K images generated by contemporary text-to-image models (Stable Diffusion 2.1, Stable Diffusion XL, Stable Diffusion 3, DALL-E 3, and Midjourney 6). VCT^2 includes two sets of prompts sourced from tweets by the New York Times Twitter account and captions from the MS COCO dataset. We also evaluate the performance of the aforementioned AGID techniques on the VCT^2 benchmark, highlighting their ineffectiveness in detecting AI-generated images. As image-generative AI models continue to evolve, the need for a quantifiable framework to evaluate these models becomes increasingly critical. To meet this need, we propose the Visual AI Index (V_AI), which assesses generated images from various visual perspectives, including texture complexity and object coherence, setting a new standard for evaluating image-generative AI models. To foster research in this domain, we make our COCO_AI (https://huggingface.co/datasets/anonymous1233/COCO_AI) and twitter_AI (https://huggingface.co/datasets/anonymous1233/twitter_AI) datasets publicly available.

artificial intelligence, machine learning, midjourney 6, (16 more...)

2411.16754

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > New York (0.04)
North America > United States > Washington (0.04)
(5 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Media > News (1.00)
Information Technology (1.00)
Government (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Issues (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.93)

Bura, Chiranjeevi, Myakala, Praveen Kumar

Advancing Transformative Education: Generative AI as a Catalyst for Equity and Innovation

Generative AI is transforming education by enabling personalized learning, enhancing administrative efficiency, and fostering creative engagement. This paper explores the opportunities and challenges these tools bring to pedagogy, proposing actionable frameworks to address existing equity gaps. Ethical considerations such as algorithmic bias, data privacy, and the role of AI in human-centric education are emphasized. The findings underscore the need for responsible AI integration that ensures accessibility, equity, and innovation in educational systems.

artificial intelligence, machine learning, natural language, (5 more...)

2411.15971

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (0.53)
Education (0.53)
Materials > Chemicals > Specialty Chemicals (0.40)

Technology:

Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.73)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.60)

arXiv.org Artificial Intelligence Nov-23-2024

Classifier-Free Guidance inside the Attraction Basin May Cause Memorization

Jain, Anubhav, Kobayashi, Yuya, Shibuya, Takashi, Takida, Yuhta, Memon, Nasir, Togelius, Julian, Mitsufuji, Yuki

Diffusion models are prone to exactly reproducing images from the training data. This exact reproduction of the training data is concerning as it can lead to copyright infringement and/or leakage of privacy-sensitive information. In this paper, we present a novel way to understand the memorization phenomenon, and propose a simple yet effective approach to mitigate it. We argue that memorization occurs because of an attraction basin in the denoising process which steers the diffusion trajectory towards a memorized image. However, this can be mitigated by guiding the diffusion trajectory away from the attraction basin: classifier-free guidance is withheld until an ideal transition point and applied only from that point onward. This leads to the generation of non-memorized images that are high in image quality and well-aligned with the conditioning mechanism. To further improve on this, we present a new guidance technique, opposite guidance, that escapes the attraction basin sooner in the denoising process. We demonstrate the existence of attraction basins in various scenarios in which memorization occurs, and we show that our proposed approach successfully mitigates memorization.
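The idea of delaying classifier-free guidance can be sketched as a per-step guidance schedule. The example below uses one common convention for classifier-free guidance, eps = eps_uncond + w * (eps_cond - eps_uncond). Everything else is an assumption made for illustration: the function names, modelling "no guidance" as w = 0 before the transition step, and modelling opposite guidance as a negative early scale. The abstract does not specify these details or how the transition point is found.

```python
def guidance_schedule(num_steps, transition_step, scale, early_scale=0.0):
    """Return one guidance weight per denoising step.

    Steps are indexed from the start of denoising (0 = noisiest).
    Before `transition_step` the weight is `early_scale`
    (assumed 0.0 for "no guidance", negative for "opposite guidance");
    from `transition_step` onward it is `scale`.
    """
    return [
        early_scale if step < transition_step else scale
        for step in range(num_steps)
    ]


def guided_noise(eps_uncond, eps_cond, w):
    """Classifier-free guidance: eps_uncond + w * (eps_cond - eps_uncond)."""
    return [u + w * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# Usage: no guidance for the first 2 of 5 steps, then scale 7.5.
delayed = guidance_schedule(num_steps=5, transition_step=2, scale=7.5)
# Hypothetical "opposite guidance": a negative weight before the transition.
opposite = guidance_schedule(5, 2, 7.5, early_scale=-1.0)
```

In a sampler, `guided_noise` would be called at each step with that step's weight from the schedule. The transition step here is a fixed input, not something the sketch detects.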

artificial intelligence, machine learning, memorization, (17 more...)

2411.16738

Country:

North America > United States > New York (0.04)
North America > United States > Connecticut > New Haven County > New Haven (0.04)
Europe > Netherlands (0.04)
Europe > Iceland > Capital Region > Reykjavik (0.04)

Genre: Research Report (0.40)

Industry:

Information Technology > Security & Privacy (0.54)
Transportation (0.46)
Media (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.46)

Kyaw, Alexander Htet, Jeon, Se Hwan, Smith, Miana, Gershenfeld, Neil

Speech to Reality: On-Demand Production using Natural Language, 3D Generative AI, and Discrete Robotic Assembly

arXiv.org Artificial Intelligence Nov-23-2024

We present a system that transforms speech into physical objects by combining 3D generative Artificial Intelligence with robotic assembly. The system leverages natural language input to make design and manufacturing more accessible, enabling individuals without expertise in 3D modeling or robotic programming to create physical objects. We propose utilizing discrete robotic assembly of lattice-based voxel components to address the challenges of using generative AI outputs in physical production, such as design variability, fabrication speed, structural integrity, and material waste. The system interprets speech to generate 3D objects, discretizes them into voxel components, computes an optimized assembly sequence, and generates a robotic toolpath. The results are demonstrated through the assembly of various objects, ranging from chairs to shelves, which are prompted via speech and realized within 5 minutes using a 6-axis robotic arm.
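The discretization and sequencing steps described above can be illustrated with a toy example: snap sampled surface points to a voxel grid, then order the occupied cells layer by layer from the bottom up. This is only a sketch under stated assumptions. The function names, the point-sampling input, and the bottom-up ordering are hypothetical stand-ins; the abstract does not describe how the system voxelizes objects or how its optimized assembly sequence is computed.

```python
import math


def voxelize(points, voxel_size):
    """Map 3D points to the set of integer grid cells they occupy."""
    return {
        tuple(int(math.floor(coord / voxel_size)) for coord in point)
        for point in points
    }


def assembly_order(cells):
    """Order cells bottom-up (by z, then y, then x).

    A simple stand-in for an assembly sequence: lower layers are
    placed before the layers that rest on them.
    """
    return sorted(cells, key=lambda cell: (cell[2], cell[1], cell[0]))


# Usage: four sample points fall into three distinct unit voxels.
sample_points = [
    (0.1, 0.1, 0.1),
    (0.9, 0.2, 0.3),
    (1.2, 0.0, 0.0),
    (0.5, 0.5, 1.5),
]
cells = voxelize(sample_points, voxel_size=1.0)
order = assembly_order(cells)
```

A real planner would also need to check reachability and structural support, which this ordering ignores.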

discrete robotic assembly, machine learning, natural language, (5 more...)

2409.1839

Genre: Research Report (0.69)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.89)

MIT Technology Review Nov-22-2024, 13:10:00 GMT

The Download: how OpenAI tests its models, and the ethics of uterus transplants

OpenAI has lifted the lid (just a crack) on its safety-testing processes. It has put out two papers describing how it stress-tests its powerful large language models to try to identify potentially harmful or otherwise unwanted behavior, an approach known as red-teaming. The first paper describes how OpenAI directs an extensive network of human testers outside the company to vet the behavior of its models before they are released. The second presents a new way to automate parts of the testing process, using a large language model like GPT-4 to come up with novel ways to bypass its own guardrails. MIT Technology Review got an exclusive preview of the work.

large language model, machine learning, natural language, (8 more...)

MIT Technology Review

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.89)

Desai, Akshar Prabhu, Mallya, Ganesh Satish, Luqman, Mohammad, Ravi, Tejasvi, Kota, Nithya, Yadav, Pranjul

Opportunities and Challenges of Generative-AI in Finance

arXiv.org Artificial Intelligence Nov-22-2024

Gen-AI techniques can improve understanding of context and nuance in language modeling, translate between languages, handle large volumes of data, provide fast, low-latency responses, and be fine-tuned for various tasks and domains. In this manuscript, we present a comprehensive overview of the applications of Gen-AI techniques in the finance domain. In particular, we present the opportunities and challenges associated with the usage of Gen-AI techniques. We also illustrate the various methodologies which can be used to train Gen-AI techniques and present the various application areas of Gen-AI technologies in the finance ecosystem. To the best of our knowledge, this work represents the most comprehensive summarization of Gen-AI techniques within the financial domain. The analysis is designed to give a deep overview of areas marked for substantial advancement while simultaneously pinpointing those warranting future prioritization. We also hope that this work will serve as a conduit between finance and other domains, thus fostering the cross-pollination of innovative concepts and practices.

large language model, machine learning, natural language, (19 more...)

2410.15653

Country:

Asia > Singapore (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New York > Kings County > New York City (0.04)
Europe (0.04)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.64)