Cultivating Game Sense for Yourself: Making VLMs Gaming Experts
Lu, Wenxuan, He, Jiangyang, Zhang, Zhanqiu, Guo, Yiwen, Zang, Tianning
Developing agents capable of fluid gameplay in first/third-person games without API access remains a critical challenge in Artificial General Intelligence (AGI). Recent efforts leverage Vision Language Models (VLMs) as direct controllers, frequently pausing the game to analyze screens and plan actions through language reasoning. However, this inefficient paradigm fundamentally restricts agents to basic, non-fluent interactions: relying on isolated VLM reasoning for each action makes it impossible to handle tasks requiring high reactivity (e.g., FPS shooting) or dynamic adaptability (e.g., ACT combat). To address this, we propose a paradigm shift in gameplay agent design: instead of directly controlling gameplay, the VLM develops specialized execution modules tailored for tasks such as shooting and combat. These modules handle real-time game interactions, elevating the VLM to a high-level developer. Building on this paradigm, we introduce GameSense, a gameplay agent framework in which the VLM develops task-specific game sense modules by observing task execution and leveraging vision tools and neural network training pipelines. These modules encapsulate action-feedback logic, ranging from direct action rules to neural-network-based decisions. Experiments demonstrate that our framework is the first to achieve fluent gameplay across diverse genres, including ACT, FPS, and Flappy Bird, setting a new benchmark for game-playing agents.
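To make the developer paradigm concrete, the sketch below shows what a task-specific module interface might look like. All class and function names here are hypothetical illustrations, not GameSense's actual interfaces: the point is only that a rule-based or learned module, once produced by the VLM, runs the real-time loop on its own.

```python
# A minimal sketch of the module-developer paradigm described above.
# GameSenseModule, RuleModule, and PolicyModule are hypothetical names.
from abc import ABC, abstractmethod

class GameSenseModule(ABC):
    """Task-specific module mapping raw frames to real-time actions."""
    @abstractmethod
    def act(self, frame) -> str:
        ...

class RuleModule(GameSenseModule):
    """Encapsulates a direct action rule, e.g. 'flap when below the gap'."""
    def __init__(self, detect_positions):
        self.detect_positions = detect_positions  # vision tool the VLM selected

    def act(self, frame) -> str:
        bird_y, gap_y = self.detect_positions(frame)
        return "flap" if bird_y > gap_y else "noop"

class PolicyModule(GameSenseModule):
    """Wraps a small policy network trained by a pipeline the VLM configures."""
    def __init__(self, policy_net, preprocess):
        self.policy_net = policy_net
        self.preprocess = preprocess

    def act(self, frame) -> str:
        # Runs every frame with no VLM call, enabling reactive control.
        return self.policy_net(self.preprocess(frame))
```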
StarFlow: Generating Structured Workflow Outputs From Sketch Images
Bechard, Patrice, Wang, Chao, Abaskohi, Amirhossein, Rodriguez, Juan, Pal, Christopher, Vazquez, David, Gella, Spandana, Rajeswar, Sai, Taslakian, Perouz
Workflows are a fundamental component of automation in enterprise platforms, enabling the orchestration of tasks, data processing, and system integrations. Despite being widely used, building workflows can be complex, often requiring manual configuration through low-code platforms or visual programming tools. To simplify this process, we explore the use of generative foundation models, particularly vision-language models (VLMs), to automatically generate structured workflows from visual inputs. Translating hand-drawn sketches or computer-generated diagrams into executable workflows is challenging due to the ambiguity of free-form drawings, variations in diagram styles, and the difficulty of inferring execution logic from visual elements. To address this, we introduce StarFlow, a framework for generating structured workflow outputs from sketches using vision-language models. We curate a diverse dataset of workflow diagrams -- including synthetic, manually annotated, and real-world samples -- to enable robust training and evaluation. We finetune and benchmark multiple vision-language models, conducting a series of ablation studies to analyze the strengths and limitations of our approach. Our results show that finetuning significantly enhances structured workflow generation, outperforming large vision-language models on this task.
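As a rough illustration of the task interface, the sketch below queries a vision-language model for a structured workflow given a diagram image. The prompt wording, the `vlm.generate` call, and the trigger/steps JSON schema are all assumptions for illustration, not StarFlow's actual prompt format or output schema.

```python
# A minimal sketch of sketch-to-workflow generation with a VLM.
import json

def sketch_to_workflow(vlm, image_bytes: bytes) -> dict:
    prompt = (
        "Convert this hand-drawn workflow diagram into a structured workflow. "
        "Return JSON with a 'trigger' and an ordered list of 'steps', where "
        "each step has a 'name', a 'type', and 'inputs'."
    )
    raw = vlm.generate(prompt=prompt, image=image_bytes)  # hypothetical VLM interface
    workflow = json.loads(raw)  # fails loudly if the model emits malformed JSON
    assert "trigger" in workflow and "steps" in workflow
    return workflow
```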
Designing Cell-Type-Specific Promoter Sequences Using Conservative Model-Based Optimization
Herschl, Michael H.
Gene therapies have the potential to treat disease by delivering therapeutic genetic cargo to disease-associated cells. One limitation to their widespread use is the lack of short regulatory sequences, or promoters, that differentially induce the expression of delivered genetic cargo in target cells, minimizing side effects in other cell types. Such cell-type-specific promoters are difficult to discover using existing methods, requiring either manual curation or access to large datasets of promoter-driven expression from both targeted and untargeted cells. Model-based optimization (MBO) has emerged as an effective method to design biological sequences in an automated manner, and has recently been used in promoter design methods. However, these methods have only been tested using large training datasets that are expensive to collect, and focus on designing promoters for markedly different cell types, overlooking the complexities associated with designing promoters for closely related cell types that share similar regulatory features. Therefore, we introduce a comprehensive framework for utilizing MBO to design promoters in a data-efficient manner, with an emphasis on discovering promoters for similar cell types. We use conservative objective models (COMs) for MBO and highlight practical considerations such as best practices for improving sequence diversity, getting estimates of model uncertainty, and choosing the optimal set of sequences for experimental validation. Using three leukemia cell lines (Jurkat, K562, and THP1), we show that our approach discovers many novel cell-type-specific promoters after experimentally validating the designed sequences. For K562 cells, in particular, we discover a promoter that has 75.85% higher cell-type-specificity than the best promoter from the initial dataset used to train our models.
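For readers unfamiliar with conservative objective models, the sketch below shows the core training idea under stated assumptions: a standard regression loss plus a penalty that lowers the surrogate's predictions on adversarially optimized inputs while raising them on observed data, following Trabucco et al. (2021). The relaxed one-hot encoding, penalty weight, and ascent settings are illustrative, not this paper's exact configuration.

```python
# A minimal sketch of conservative objective models (COMs) for sequence design.
import torch

def adversarial_points(model, x, steps=10, lr=0.1):
    """Gradient-ascend the surrogate's input to find overestimated designs."""
    x_adv = x.clone().requires_grad_(True)
    for _ in range(steps):
        score = model(x_adv).sum()
        (g,) = torch.autograd.grad(score, x_adv)
        x_adv = (x_adv + lr * g).detach().requires_grad_(True)
    return x_adv.detach()

def com_loss(model, x, y, alpha=0.5):
    """Regression loss plus a conservative penalty: push predictions down
    on adversarially optimized inputs and up on real dataset points."""
    mse = ((model(x).squeeze(-1) - y) ** 2).mean()
    x_adv = adversarial_points(model, x)
    penalty = model(x_adv).mean() - model(x).mean()
    return mse + alpha * penalty

# Usage assumption: x holds relaxed one-hot promoter sequences of shape
# (batch, length, 4). After training with com_loss, candidate promoters
# are designed by the same gradient ascent, discretized, and validated.
```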
Probabilistic Inference in Reinforcement Learning Done Right
A popular perspective in reinforcement learning (RL) casts the problem as probabilistic inference on a graphical model of the Markov decision process (MDP). The core object of study is the probability of each state-action pair being visited under the optimal policy. Previous approaches to approximate this quantity can be arbitrarily poor, leading to algorithms that do not implement genuine statistical inference and consequently do not perform well in challenging problems. In this work, we undertake a rigorous Bayesian treatment of the posterior probability of state-action optimality and clarify how it flows through the MDP. We first reveal that this quantity can indeed be used to generate a policy that explores efficiently, as measured by regret. Unfortunately, computing it is intractable, so we derive a new variational Bayesian approximation yielding a tractable convex optimization problem and establish that the resulting policy also explores efficiently. We call our approach VAPOR and show that it has strong connections to Thompson sampling, K-learning, and maximum entropy exploration. We conclude with some experiments demonstrating the performance advantage of a deep RL version of VAPOR.
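To make the convex-optimization view concrete, the sketch below solves a small entropy-regularized program over state-action occupancy measures with an off-the-shelf solver. It is a simplified stand-in for the paper's variational problem, not VAPOR itself: it keeps only the maximum-entropy regularizer and drops the uncertainty (optimism) term, and the tabular MDP, discount, and temperature tau are illustrative assumptions.

```python
# A minimal entropy-regularized occupancy-measure program (simplified,
# not VAPOR's actual objective).
import cvxpy as cp
import numpy as np

S, A, gamma, tau = 5, 3, 0.9, 0.1
rng = np.random.default_rng(0)
r = rng.uniform(size=(S, A))                    # reward table
P = rng.dirichlet(np.ones(S), size=(S, A))      # P[s, a] = next-state distribution
rho = np.full(S, 1.0 / S)                       # initial state distribution

lam = cp.Variable((S, A), nonneg=True)          # state-action occupancy measure
# Flow constraints: out-flow of each state equals its initial mass plus
# discounted in-flow from all predecessor state-action pairs.
flow = [cp.sum(lam[s]) == rho[s] + gamma * cp.sum(cp.multiply(P[:, :, s], lam))
        for s in range(S)]
objective = cp.Maximize(cp.sum(cp.multiply(r, lam)) + tau * cp.sum(cp.entr(lam)))
cp.Problem(objective, flow).solve()

# Recover a stochastic policy by normalizing the occupancy per state.
policy = lam.value / lam.value.sum(axis=1, keepdims=True)
```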
Deep Active Learning with a Neural Architecture Search
We consider active learning of deep neural networks. Most active learning works in this context have focused on studying effective querying mechanisms and assumed that an appropriate network architecture is a priori known for the problem at hand. We challenge this assumption and propose a novel active strategy whereby the learning algorithm searches for effective architectures on the fly, while actively learning. We apply our strategy using three known querying techniques (softmax response, MC-dropout, and coresets) and show that the proposed approach overwhelmingly outperforms active learning using fixed architectures.
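Of the three querying techniques, softmax response is the simplest to state; a minimal sketch follows. The batch interface is an assumption, and the paper's distinguishing step, re-running the architecture search after each labeling round, is not shown here.

```python
# A minimal sketch of the softmax response query rule: label the pool
# points whose top softmax probability is lowest.
import numpy as np

def softmax_response_query(probs: np.ndarray, budget: int) -> np.ndarray:
    """probs: (n_unlabeled, n_classes) softmax outputs.
    Returns indices of the `budget` least-confident points to label next."""
    confidence = probs.max(axis=1)          # top-class probability per point
    return np.argsort(confidence)[:budget]  # lowest confidence first

# Example: query 2 of 4 pool points.
pool_probs = np.array([[0.9, 0.1], [0.55, 0.45], [0.6, 0.4], [0.99, 0.01]])
print(softmax_response_query(pool_probs, budget=2))  # -> [1 2]
```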
On the Effects of Data Scale on UI Control Agents
Autonomous agents that control user interfaces to accomplish human tasks are emerging. Leveraging LLMs to power such agents has been of special interest, but unless such models are fine-tuned on human-collected task demonstrations, performance remains relatively low. In this work, we study whether fine-tuning alone is a viable approach for building real-world UI control agents.