A Character-Centric Creative Story Generation via Imagination
Park, Kyeongman, Kim, Minbeom, Jung, Kyomin
Creative story generation has long been a goal of NLP research. While existing methodologies have aimed to generate long and coherent stories, they fall significantly short of human capabilities in terms of diversity and character depth. To address this, we introduce a novel story generation framework called CCI (Character-centric Creative story generation via Imagination). CCI features two modules for creative story generation: IG (Image-Guided Imagination) and MW (Multi-Writer model). In the IG module, we utilize a text-to-image model to create visual representations of key story elements, such as characters, backgrounds, and main plots, in a more novel and concrete manner than text-only approaches. The MW module uses these story elements to generate multiple persona-description candidates and selects the best one to insert into the story, thereby enhancing the richness and depth of the narrative. We compared the stories generated by CCI and baseline models through statistical analysis, as well as human and LLM evaluations. The results showed that the IG and MW modules significantly improve various aspects of the stories' creativity. Furthermore, our framework enables interactive multi-modal story generation with users, opening up new possibilities for human-LLM integration in cultural development. Project page: https://www.2024cci.p-e.kr/
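The abstract says the MW module generates multiple persona-description candidates and selects the best one, but does not specify the selection mechanism. As a minimal sketch of that candidate-selection idea, with `detail_score` as a purely hypothetical stand-in for whatever scorer MW actually uses:

```python
def select_best_persona(candidates, score):
    """Return the highest-scoring persona-description candidate."""
    return max(candidates, key=score)

def detail_score(description):
    """Hypothetical heuristic: reward longer descriptions and concrete
    character details (a stand-in for the MW module's real scorer)."""
    concrete_words = {"scar", "limp", "accent", "habit", "fear"}
    words = description.lower().split()
    return len(words) + 5 * sum(w.strip(".,") in concrete_words for w in words)

candidates = [
    "a brave knight",
    "a brave knight with a limp and a fear of open water",
]
best = select_best_persona(candidates, detail_score)
```

Any scoring function over candidate strings would slot into `select_best_persona` the same way.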
Toward a Human-Level Video Understanding Intelligence
Heo, Yu-Jung, Lee, Minsu, Choi, Seongho, Choi, Woo Suk, Shin, Minjung, Jung, Minjoon, Ryu, Jeh-Kwang, Zhang, Byoung-Tak
We aim to develop an AI agent that can watch video clips and have a conversation with humans about the video story. Developing video understanding intelligence is a significantly challenging task, and evaluation methods for adequately measuring and analyzing the progress of AI agents are lacking as well. In this paper, we propose the Video Turing Test to provide effective and practical assessments of video understanding intelligence as well as human-likeness evaluation of AI agents. We define a general format and procedure for the Video Turing Test and present a case study to confirm the effectiveness and usefulness of the proposed test.
CogME: A Novel Evaluation Metric for Video Understanding Intelligence
Shin, Minjung, Kim, Jeonghoon, Choi, Seongho, Heo, Yu-Jung, Kim, Donghyun, Lee, Minsu, Zhang, Byoung-Tak, Ryu, Jeh-Kwang
Developing video understanding intelligence is quite challenging because it requires holistic integration of images, scripts, and sounds based on natural language processing, temporal dependency, and reasoning. Recently, substantial attempts have been made at large-scale question answering (QA) on several video datasets. However, existing evaluation metrics for video question answering (VideoQA) do not provide meaningful analysis. To make progress, we argue that a well-designed framework, grounded in the way humans understand stories, is required to explain and evaluate the performance of understanding in detail. We therefore propose a top-down evaluation system for VideoQA, based on the cognitive process of humans and story elements: Cognitive Modules for Evaluation (CogME). CogME is composed of three cognitive modules: targets, contents, and thinking. The interaction among the modules in the understanding procedure can be expressed in one sentence: "I understand the CONTENT of the TARGET through a way of THINKING." Each module has sub-components derived from the story elements. We can specify the required aspects of understanding by annotating the sub-components of individual questions. CogME thus provides a framework for an elaborated specification of VideoQA datasets. To examine the suitability of a VideoQA dataset for validating video understanding intelligence, we evaluated the baseline model of the DramaQA dataset by applying CogME. The evaluation reveals that story elements are unevenly reflected in the existing dataset, and that a model trained on the dataset may make biased predictions. Although this study covers only a narrow range of stories, we expect it to offer a first step toward grounding the video understanding intelligence of humans and AI in the human cognitive process.
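The abstract describes annotating each question with sub-components of the three CogME modules and then checking how evenly the dataset covers them. A minimal sketch of that bookkeeping, with all labels and questions purely illustrative (the paper's actual sub-component taxonomy is not reproduced here):

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class CogMEAnnotation:
    """One question's annotation across the three CogME modules."""
    targets: tuple   # what the question asks about
    contents: tuple  # which aspect of the target
    thinking: tuple  # the reasoning type required

# Hypothetical annotations for three VideoQA questions.
annotations = [
    CogMEAnnotation(("character",), ("identity",), ("recall",)),
    CogMEAnnotation(("character",), ("motivation",), ("causal reasoning",)),
    CogMEAnnotation(("place",), ("identity",), ("recall",)),
]

def coverage(annotations, module):
    """Count how often each sub-component of a module is exercised,
    exposing uneven coverage of story elements in a dataset."""
    return Counter(label for a in annotations for label in getattr(a, module))

target_counts = coverage(annotations, "targets")
```

Comparing these counts across modules is one way to surface the kind of imbalance the evaluation of DramaQA revealed.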
A computer reads a story - Visage Technologies
Sequencing is a task for children that aims to improve their understanding of the temporal order of events: they sort various images (sometimes with captions) into a coherent story. Researchers from Virginia Tech and TTI-Chicago have proposed a machine-learning analogue of this task: given a jumbled set of aligned image-caption pairs that belong to a story, the computer must sort them so that they form a coherent story. They used stories from the Sequential Image Narrative Dataset, in which a set of 5 aligned image-caption pairs together forms a coherent story, and trained machine-learning models to sort jumbled input stories. They proposed the task of visual story sequencing and implemented two approaches: the first uses individual story elements to predict each element's position, and the second uses pairwise story elements to predict the relative order of story elements.
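The pairwise approach described above predicts, for each pair of story elements, which one comes first; a global order must then be recovered from those pairwise predictions. One simple way to do that, sketched below with a toy stand-in for the trained pairwise model (the researchers' actual model and aggregation scheme may differ):

```python
def order_by_pairwise_wins(items, precedes):
    """Order items by counting, for each item, how many pairwise
    predictions say it comes before the other item. An item that
    precedes everything gets the most 'wins' and is placed first."""
    n = len(items)
    wins = [0] * n
    for i in range(n):
        for j in range(n):
            if i != j and precedes(items[i], items[j]):
                wins[i] += 1
    ranked = sorted(range(n), key=lambda i: -wins[i])
    return [items[i] for i in ranked]

# Toy stand-in for a trained pairwise model: each caption carries its
# true position, so this "model" just compares positions.
captions = [(2, "they hiked up"), (0, "we packed the car"), (1, "the drive was long")]
ordered = order_by_pairwise_wins(captions, lambda a, b: a[0] < b[0])
```

With a real model, `precedes` would be replaced by a learned classifier over the two image-caption pairs.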
A Tripartite Plan-Based Model of Narrative for Narrative Discourse Generation
Barot, Camille (North Carolina State University) | Potts, Colin Murray (North Carolina State University) | Young, R. Michael (North Carolina State University)
The story is a conceptualization of the world of the narrative, with the characters, actions and events that it contains, while the discourse is composed of the communicative elements that participate in its telling. Research on computational models of narrative has produced many models of story, based for instance on a particular medium. However, the discourse layer is not simply an ordered subset of elements of the story layer. Genette argues that every discourse implies a narrator. In this, the discourse is an intentional structure through which the narrator "regulates the narrative information" given to the audience, and its representation should include these intentions.