 progress


FilmAgent: A Multi-Agent Framework for End-to-End Film Automation in Virtual 3D Spaces

Xu, Zhenran, Wang, Longyue, Wang, Jifang, Li, Zhouyi, Shi, Senbao, Yang, Xue, Wang, Yiyu, Hu, Baotian, Yu, Jun, Zhang, Min

arXiv.org Artificial Intelligence

Virtual film production requires intricate decision-making processes, including scriptwriting, virtual cinematography, and precise actor positioning and actions. Motivated by recent advances in automated decision-making with language agent-based societies, this paper introduces FilmAgent, a novel LLM-based multi-agent collaborative framework for end-to-end film automation in our constructed 3D virtual spaces. FilmAgent simulates various crew roles, including directors, screenwriters, actors, and cinematographers, and covers key stages of a film production workflow: (1) idea development transforms brainstormed ideas into structured story outlines; (2) scriptwriting elaborates on dialogue and character actions for each scene; (3) cinematography determines the camera setups for each shot. A team of agents collaborates through iterative feedback and revisions, thereby verifying intermediate scripts and reducing hallucinations. We evaluate the generated videos on 15 ideas and 4 key aspects. Human evaluation shows that FilmAgent outperforms all baselines across all aspects and scores 3.98 out of 5 on average, showing the feasibility of multi-agent collaboration in filmmaking. Further analysis reveals that FilmAgent, despite using the less advanced GPT-4o model, surpasses the single-agent o1, showing the advantage of a well-coordinated multi-agent system. Lastly, we discuss the complementary strengths and weaknesses of OpenAI's text-to-video model Sora and our FilmAgent in filmmaking.


AI Democratization a Work in Progress, H2O's Ambati Says

#artificialintelligence

While only about 1% of companies are making the most of their data today, real progress is being made in democratizing the use of AI, and the future of business automation via AI is quite bright, H2O.ai's CEO and founder Sri Ambati said before a pair of H2O World conferences this week. "There's still a long way to go from where we are. It's in the earliest phases of adoption," Ambati told Datanami in an interview earlier this month. "You can see that only 1%, or less than 1%, of the world's companies can truly leverage their data. So that means 99% needs further adoption, simplification, and cultural transformation to use data and AI. It's going to take the next 10 to 20 years."


Trust Artificial Intelligence? Still A Work In Progress, Survey Shows

#artificialintelligence

Our dependency on AI-based outputs seems to grow every day, from both a business and a personal perspective. But are we willing to fully trust this output? Are we sure the data fed into these systems is accurate? Are the decision models and algorithms kept up to date? Are humans kept in the loop?


The Progress Of AI - AI Summary

#artificialintelligence

Look at this: college students are sharing (anonymously) that they've started using AI tools to generate essays that can bypass anti-plagiarism software and score an A. The widespread use of the tools could reshape education and force schools to figure out new writing prompts or entirely fresh ways of assessing student performance to avoid being duped by the technology. Most projections have the AI niche reaching over $420 billion in total market size by 2028, a compound annual growth rate of 39.4 percent. Google is in talks to invest at least $200 million into AI start-up Cohere Inc., according to people familiar with the matter; another sign of the escalating arms race among large technology companies in the sector. There are some harbored fears surrounding AI ranging from doomsday scenarios to simple ethics concerns, but the overall trend is clear, and investors seem to have confidence that humanity will make the necessary adjustments to coexist with this new tech.


Research in Progress

AI Magazine

Computer Science Department, Yale University. The Cognition and Programming Project (CAPP) in the Computer Science Department at Yale University is an interdisciplinary group exploring a wide range of issues in programming. [Footnote: This project is currently being funded by NSF RISE, under grant number SED-81-12403.] [Footnote: This project is currently being funded by NSF IST, under grant number IST-81-14840.] We have also shown that when the language construct agrees with people's natural problem-solving strategies, they can learn to use such constructs effectively. The implication is that language designers should be more sensitive to the cognitive capabilities which people bring to programming, and that computing educators should be aware of the systematic misconceptions which arise due to cognitively poor programming language constructs. Using our theory of programming plans, we are developing measures of program complexity that are based on the underlying mental effort needed to understand programs. This approach is in contrast to typical measures of program complexity, which are sensitive only to surface features of programs.


Research in Progress

AI Magazine

Automated Problem Solving Group, Jet Propulsion Laboratory, 4800 Oak Grove Dr., Pasadena, California 91109. AI research at JPL started in 1972 when design and construction of an experimental "Mars Rover" began. Early in that effort, it was recognized that rover planning capabilities were inadequate. Research in planning was begun in 1975, and work on a succession of AI expert systems of steadily increasing power has continued to the present. Within the group, we have concentrated our efforts on expert systems, although work on vision and robotics has continued in a separate organization, with which we have maintained informal contacts. The thrust of our work has been to build expert systems that can be applied in a real-world environment, and to actually put our systems into such environments, taking a consultative responsibility for meeting user requirements.


Research in Progress

AI Magazine

In terms of basic research, our current focus is the development of broadly applicable techniques for description and matching of structure in sensory data. Such techniques appear to underlie virtually every aspect of early and intermediate vision, such as edge and region finding, perceptual organization and grouping, and the recovery of 3-D shape from contour, texture, stereo, and motion. They appear to be equally important in other sensory domains, such as audition (e.g., for describing the structure in spectrograms). In particular, we are dealing with the problem of grey-level inspection, and are constructing a vision workbench to allow rapid experimentation with alternative techniques. Finally, we are examining a variety of special-purpose architectures for image processing. These range from a SUN (MC68000-based) workstation, augmented with high-speed pipelined VLSI components, to a massively parallel architecture involving a thousand processors and a novel interconnection network. Knowledge Representation. Contact: Ronald J. Brachman. Having had experience with knowledge representation systems designed to support "common sense" reasoning, we are developing and implementing a new framework for representation and reasoning in areas requiring "expertise."


345

AI Magazine

Conspicuously absent from the 5th Generation Computer Project's proclaimed goals is one vitally important in a 1990's knowledge-intensive society: the ability to help people tame mountains of video-based information. A decade from now, the nation will be crisscrossed with fiberoptic bundles capable of simultaneously carrying thousands of hi-resolution video conversations, and solid-state video cameras will be as abundant as microphone pickup devices are today. In short, the voice-telephone and printed-page information networks over which we communicate will be joined by 2-way, super-narrowcast video, where each knowledge worker both receives product from myriad sources and reshapes and originates his own unique product. The main activities interactive video will support are the same ones that will occupy people a decade from now: learning and teaching. Already, one can "walk through" homes for sale thousands of miles away, learn how to assemble, operate and fix complex machinery, drive around the streets of Aspen, Colorado, and learn facial communication skills using this powerful medium.


Starting a Knowledge Engineering Project: A Step-by-Step Approach

AI Magazine

Artificial Intelligence Department, Computer Research Laboratory, Tektronix, Inc., Post Office Box 500, Beaverton, Oregon 97077. Getting started on a new knowledge engineering project is a difficult and challenging task, even for those who have done it before. For those who haven't, the task can often prove impossible. One reason is that the requirements-oriented methods and intuitions learned in the development of other types of software do not carry over well to the knowledge engineering task. Another reason is that methodologies for developing expert systems by extracting, representing, and manipulating an expert's knowledge have been slow in coming. At Tektronix, we have been using a step-by-step approach to prototyping expert systems for over two years now.


Transfer Learning Progress and Potential

AI Magazine

As evidenced by the articles in this special issue, transfer learning has come a long way in the past five or so years, partially because of DARPA's Transfer Learning program, which sponsored much of the work reported in this issue. There is a Transfer Learning Toolkit for Matlab available on the web. Transfer learning has developed techniques for classification, regression, and clustering (as summarized in Pan and Yang's 2009 survey) and for complex interactive tasks that are often best addressed by reinforcement learning techniques. However, there is a more practical and more feasible goal for transfer learning against which progress is being made. An engineering-oriented goal of artificial intelligence that could be enabled by transfer learning is the ability to construct a large number of diverse applications not from scratch, but by taking advantage of knowledge already acquired and formally represented for other purposes.