AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.41)

Kevin Ellis, Daniel Ritchie, Armando Solar-Lezama, Josh Tenenbaum

Learning to Infer Graphics Programs from Hand-Drawn Images

Neural Information Processing Systems Feb-13-2026, 00:54:29 GMT

Neural Information Processing Systems http://nips.cc/

graphic program, neural network, spec, (13 more...)

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Neural Information Processing Systems Feb-12-2026, 04:03:13 GMT

Write, Execute, Assess: Program Synthesis with a REPL

Kevin Ellis, Maxwell Nye, Yewen Pu, Felix Sosa, Josh Tenenbaum, Armando Solar-Lezama

Neural Information Processing Systems http://nips.cc/

program synthesis, spec, value function, (12 more...)

Country: North America > Canada (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Kevin Ellis, Daniel Ritchie, Armando Solar-Lezama, Josh Tenenbaum

Learning to Infer Graphics Programs from Hand-Drawn Images

Neural Information Processing Systems Nov-20-2025, 17:02:23 GMT

The model combines techniques from deep learning and program synthesis.

artificial intelligence, machine learning, natural language, (18 more...)

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
North America > Canada > Quebec > Montreal (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Neural Information Processing Systems Oct-2-2025, 17:30:52 GMT

Write, Execute, Assess: Program Synthesis with a REPL

Kevin Ellis, Maxwell Nye, Yewen Pu, Felix Sosa, Josh Tenenbaum, Armando Solar-Lezama

We train a pair of models: a policy that proposes the new piece of code to write, and a value function that assesses the prospects of the code written so far.
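The policy/value split described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the two-instruction toy language, the hand-written `policy` and `value_fn` (stand-ins for the trained models), and the beam search that combines them. It is not the authors' implementation.

```python
from typing import Callable, Dict, List

# Hypothetical toy language: a program is a list of ops applied to 0.
OPS: Dict[str, Callable[[int], int]] = {
    "inc": lambda x: x + 1,
    "dbl": lambda x: x * 2,
}

def execute(program: List[str]) -> int:
    """The 'REPL': run the partial program and return its current value."""
    value = 0
    for op in program:
        value = OPS[op](value)
    return value

def policy(state: int, target: int) -> List[str]:
    """Stand-in for a learned policy: propose next ops that do not overshoot."""
    return [op for op, fn in OPS.items() if fn(state) <= target]

def value_fn(state: int, target: int) -> float:
    """Stand-in for a learned value function: higher means more promising."""
    return -abs(target - state)

def synthesize(target: int, beam_width: int = 3, max_steps: int = 20) -> List[str]:
    """Write (policy), execute (REPL), assess (value) inside a beam search."""
    beam: List[List[str]] = [[]]
    for _ in range(max_steps):
        candidates: List[List[str]] = []
        for program in beam:
            state = execute(program)
            if state == target:
                return program
            for op in policy(state, target):
                candidates.append(program + [op])
        if not candidates:
            break
        # Keep only the partial programs whose executed state scores best.
        candidates.sort(key=lambda p: value_fn(execute(p), target), reverse=True)
        beam = candidates[:beam_width]
    for program in beam:
        if execute(program) == target:
            return program
    raise ValueError("no program found within the step budget")
```

Calling `synthesize(10)` returns a list of op names whose execution yields 10. The point of the sketch is that partial programs are scored by executing them, not by inspecting their syntax.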

artificial intelligence, machine learning, natural language, (14 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence Mar-19-2025

TikZero: Zero-Shot Text-Guided Graphics Program Synthesis

Jonas Belouadi, Eddy Ilg, Margret Keuper, Hideki Tanaka, Masao Utiyama, Raj Dabre, Steffen Eger, Simone Paolo Ponzetto

With the rise of generative AI, synthesizing figures from text captions becomes a compelling application. However, achieving high geometric precision and editability requires representing figures as graphics programs in languages like TikZ, and aligned training data (i.e., graphics programs with captions) remains scarce. Meanwhile, large amounts of unaligned graphics programs and captioned raster images are more readily available. We reconcile these disparate data sources by presenting TikZero, which decouples graphics program generation from text understanding by using image representations as an intermediary bridge. It enables independent training on graphics programs and captioned images and allows for zero-shot text-guided graphics program synthesis during inference. We show that our method substantially outperforms baselines that can only operate with caption-aligned graphics programs. Furthermore, when leveraging caption-aligned graphics programs as a complementary training signal, TikZero matches or exceeds the performance of much larger models, including commercial systems like GPT-4o. Our code, datasets, and select models are publicly available.

large language model, machine learning, natural language, (20 more...)

2503.11509

Country:

North America > United States > New York > New York County > New York City (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)
Asia > Japan (0.04)
(14 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

arXiv.org Artificial Intelligence Oct-21-2024

Learning to Synthesize Graphics Programs for Geometric Artworks

Qi Bing, Chaoyi Zhang, Weidong Cai

Creating and understanding art has long been a hallmark of human ability. When presented with finished digital artwork, professional graphic artists can intuitively deconstruct and replicate it using various drawing tools, such as the line tool, paint bucket, and layer features, including opacity and blending modes. While most recent research in this field has focused on art generation, proposing a range of methods, these often rely on the concept of artwork being represented as a final image. To bridge the gap between pixel-level results and the actual drawing process, we present an approach that treats a set of drawing tools as executable programs. This method predicts a sequence of steps to achieve the final image, allowing for understandable and resolution-independent reproductions using a set of drawing commands. Our experiments demonstrate that our program synthesizer, Art2Prog, can comprehensively understand complex input images and reproduce them using high-quality executable programs. The experimental results demonstrate the potential of machines to grasp higher-level information from images and generate compact program-level descriptions.

artificial intelligence, machine learning, natural language, (18 more...)

2410.15768

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
Asia (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Artificial Intelligence Aug-15-2024

Can Large Language Models Understand Symbolic Graphics Programs?

Zeju Qiu, Weiyang Liu, Haiwen Feng, Zhen Liu, Tim Z. Xiao, Katherine M. Collins, Joshua B. Tenenbaum, Adrian Weller, Michael J. Black, Bernhard Schölkopf

Assessing the capabilities of large language models (LLMs) is often challenging, in part, because it is hard to find tasks to which they have not been exposed during training. We take one step to address this challenge by turning to a new task: focusing on symbolic graphics programs, which are a popular representation for graphics content that procedurally generates visual data. LLMs have shown exciting promise towards program synthesis, but do they understand symbolic graphics programs? Unlike conventional programs, symbolic graphics programs can be translated to graphics content. Here, we characterize an LLM's understanding of symbolic programs in terms of its ability to answer questions related to the graphics content. This task is challenging as the questions are difficult to answer from the symbolic programs alone -- yet, they would be easy to answer from the corresponding graphics content as we verify through a human experiment. To understand symbolic programs, LLMs may need to possess the ability to imagine how the corresponding graphics content would look without directly accessing the rendered visual content. We use this task to evaluate LLMs by creating a large benchmark for the semantic understanding of symbolic graphics programs. This benchmark is built via program-graphics correspondence, hence requiring minimal human effort. We evaluate current LLMs on our benchmark to provide a preliminary assessment of their ability to reason about visual scenes from programs. We find that this task distinguishes existing LLMs, and models considered good at reasoning perform better. Lastly, we introduce Symbolic Instruction Tuning (SIT) to improve this ability. Specifically, we query GPT-4o with questions and images generated by symbolic programs. Such data are then used to finetune an LLM. We also find that SIT data can improve the general instruction following ability of LLMs.

rect height, style, subnodeconstraint, (17 more...)

2408.08313

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Berlin (0.04)
(2 more...)

Genre: Research Report > New Finding (0.45)

Industry:

Education (1.00)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems Mar-14-2024, 01:06:13 GMT

fa14d4fe2f19414de3ebd9f63d5c0169-Reviews.html

This paper proposes a general method for solving image-based computer vision tasks using a generative probabilistic model that uses a graphics program to generate images. The method takes the standard Bayesian approach to frame the inference of the target variables, and uses Metropolis-Hastings to perform the inference. This framework is implemented for a CAPTCHA and a road-finding application, with favorable results reported for each one. The primary contribution of this paper is a proposal for using graphics programs as a key element of a generative model for image-based tasks. While their claim that there are no previous real-world image interpretation frameworks that combine computer graphics among the other elements they list (last paragraph of Section 1) seems accurate, their proposed system does not seem to qualify as such a framework unless it's under a restricted interpretation.
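The inference scheme the review summarizes (a graphics program as the generative model, Metropolis-Hastings as the inference procedure) can be sketched on a toy problem. This is only an illustration under stated assumptions: the 1-D "image", the `render` program, the Gaussian pixel likelihood, the uniform prior, and the random-walk proposal are all hypothetical and not taken from the reviewed paper.

```python
import math
import random
from typing import List, Tuple

WIDTH = 20  # number of pixels in the toy 1-D "image" (assumed)

def render(pos: int, size: int) -> List[float]:
    """Hypothetical graphics program: a bar of `size` pixels starting at `pos`."""
    return [1.0 if pos <= i < pos + size else 0.0 for i in range(WIDTH)]

def log_posterior(params: Tuple[int, int], observed: List[float],
                  noise: float = 0.5) -> float:
    """Uniform prior over valid parameters plus a Gaussian pixel likelihood."""
    pos, size = params
    if not (0 <= pos < WIDTH and 1 <= size <= WIDTH):
        return -math.inf
    sq_err = sum((r - o) ** 2 for r, o in zip(render(pos, size), observed))
    return -sq_err / (2 * noise ** 2)

def metropolis_hastings(observed: List[float], steps: int = 2000,
                        seed: int = 0) -> Tuple[int, int]:
    """Random-walk Metropolis-Hastings; returns the best-scoring sample seen."""
    rng = random.Random(seed)
    current = (0, 1)
    current_lp = log_posterior(current, observed)
    best, best_lp = current, current_lp
    for _ in range(steps):
        # Symmetric proposal: nudge each parameter by -1, 0, or +1.
        proposal = (current[0] + rng.choice([-1, 0, 1]),
                    current[1] + rng.choice([-1, 0, 1]))
        proposal_lp = log_posterior(proposal, observed)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
            if current_lp > best_lp:
                best, best_lp = current, current_lp
    return best
```

A typical use is `metropolis_hastings(render(7, 4))`, which searches for bar parameters whose rendering explains the observed pixels. Because the proposal is symmetric, the acceptance rule needs only the posterior ratio.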

application, generative model, graphic program, (8 more...)

Genre: Summary/Review (0.51)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.56)

arXiv.org Artificial Intelligence May-31-2023

From Perception to Programs: Regularize, Overparameterize, and Amortize

Hao Tang, Kevin Ellis

Toward combining inductive reasoning with perception abilities, we develop techniques for neurosymbolic program synthesis where perceptual input is first parsed by neural nets into a low-dimensional interpretable representation, which is then processed by a synthesized program. We explore several techniques for relaxing the problem and jointly learning all modules end-to-end with gradient descent: multitask learning; amortized inference; overparameterization; and a differentiable strategy for penalizing lengthy programs. Collectively, this toolbox improves the stability of gradient-guided program search, and suggests ways of learning both how to perceive input as discrete abstractions, and how to symbolically process those abstractions as programs.

artificial intelligence, machine learning, operator, (17 more...)

2206.05922

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Europe > France (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)