RACCOON: A Retrieval-Augmented Generation Approach for Location Coordinate Capture from News Articles
Lin, Jonathan, Joshi, Aditya, Paik, Hye-young, Duong, Tri Dung, Gurdasani, Deepti
Geocoding involves automatic extraction of location coordinates of incidents reported in news articles, and can be used for epidemic intelligence or disaster management. This paper introduces Retrieval-Augmented Coordinate Capture Of Online News articles (RACCOON), an open-source geocoding approach that extracts geolocations from news articles. RACCOON uses a retrieval-augmented generation (RAG) approach where candidate locations and associated information are retrieved in the form of context from a location database, and a prompt containing the retrieved context, location mentions and news articles is fed to an LLM to generate the location coordinates. Our evaluation on three datasets, two underlying LLMs, three baselines and several ablation tests based on the components of RACCOON demonstrates the utility of RACCOON. To the best of our knowledge, RACCOON is the first RAG-based approach for geocoding using pre-trained LLMs.
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom (0.04)
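The retrieve-then-prompt pipeline described in the abstract can be sketched as below. This is an illustrative stand-in, not the paper's implementation: the gazetteer, prompt template, and `mock_llm` function are all hypothetical placeholders for the real location database and LLM call.

```python
# Minimal sketch of a RAG-style geocoding pipeline in the spirit of RACCOON.
# The gazetteer entries, prompt wording, and LLM are illustrative stand-ins.

GAZETTEER = {
    "Sydney": {"path": "Oceania > Australia > New South Wales > Sydney",
               "lat": -33.8688, "lon": 151.2093},
    "Cambridge": {"path": "Europe > United Kingdom > England > Cambridgeshire > Cambridge",
                  "lat": 52.2053, "lon": 0.1218},
}

def retrieve_candidates(mentions):
    """Retrieval step: look up each location mention in the location database."""
    return {m: GAZETTEER[m] for m in mentions if m in GAZETTEER}

def build_prompt(article, mentions, candidates):
    """Assemble retrieved context + mentions + article into one prompt."""
    context = "\n".join(
        f"- {m}: {c['path']} ({c['lat']}, {c['lon']})" for m, c in candidates.items()
    )
    return (f"Candidate locations:\n{context}\n\n"
            f"Mentions: {', '.join(mentions)}\n\nArticle: {article}\n\n"
            "Return the coordinates of the incident location.")

def mock_llm(prompt, candidates):
    """Stand-in for the LLM: returns the first candidate's coordinates."""
    first = next(iter(candidates.values()))
    return (first["lat"], first["lon"])

article = "An outbreak was reported in Sydney earlier this week."
mentions = ["Sydney"]
cands = retrieve_candidates(mentions)
prompt = build_prompt(article, mentions, cands)
coords = mock_llm(prompt, cands)
```

The key design point is that the LLM never answers from parametric memory alone: the gazetteer context grounds its coordinate output.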
Dynamic Ensemble Reasoning for LLM Experts
Hu, Jinwu, Wang, Yufeng, Zhang, Shuhai, Zhou, Kai, Chen, Guohao, Hu, Yu, Xiao, Bin, Tan, Mingkui
Ensemble reasoning that combines the strengths of different LLM experts is critical to achieving consistent and satisfactory performance on diverse inputs across a wide range of tasks. However, existing LLM ensemble methods are either computationally intensive or incapable of leveraging complementary knowledge among LLM experts for various inputs. In this paper, we propose a Dynamic Ensemble Reasoning paradigm, called DER, to integrate the strengths of multiple LLM experts conditioned on dynamic inputs. Specifically, we model the LLM ensemble reasoning problem as a Markov Decision Process (MDP), wherein an agent sequentially takes inputs to request knowledge from an LLM candidate and passes the output to a subsequent LLM candidate. Moreover, we devise a reward function to train a DER-Agent to dynamically select an optimal answering route given the input questions, aiming to achieve the highest performance with as few computational resources as possible. Finally, to fully transfer the expert knowledge from the prior LLMs, we develop a Knowledge Transfer Prompt (KTP) that enables the subsequent LLM candidates to transfer complementary knowledge effectively. Experiments demonstrate that our method uses fewer computational resources to achieve better performance compared to state-of-the-art baselines.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > China > Hong Kong (0.04)
- Asia > China > Chongqing Province > Chongqing (0.04)
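The sequential routing idea in the DER abstract can be illustrated with a toy loop. Everything here is a placeholder: the two "experts" are stub functions, the round-robin choice stands in for the trained RL agent, and the length-based stopping rule stands in for the learned reward; the Knowledge Transfer Prompt is approximated by passing accumulated outputs forward.

```python
# Toy sketch of sequential LLM ensemble routing in the spirit of DER.
# Experts, routing policy, and stopping rule are illustrative stand-ins;
# the paper trains the routing agent with RL, which is omitted here.

def expert_a(question, prior):
    # Each expert sees the accumulated prior outputs (cf. the KTP idea).
    return prior + ["draft answer from expert A"]

def expert_b(question, prior):
    return prior + ["refined answer from expert B"]

EXPERTS = [expert_a, expert_b]

def route(question, max_steps=2, enough_outputs=1):
    """Sequentially query experts, passing outputs forward; stop early
    once a (toy) sufficiency condition is met, to save compute."""
    outputs, cost = [], 0
    for step in range(max_steps):
        expert = EXPERTS[step % len(EXPERTS)]  # a trained DER-Agent would choose here
        outputs = expert(question, outputs)
        cost += 1                              # one expert call = one unit of compute
        if len(outputs) >= enough_outputs:     # stand-in for the learned stop signal
            break
    return outputs[-1], cost

answer, cost = route("What is 2 + 2?")
```

The point of the early stop is the cost/performance trade-off the abstract emphasizes: easy inputs should terminate after few expert calls.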
It's the AI Election Year
In the largest global election year yet, generative AI is already being used to trick and manipulate voters around the world. Will this growing trend have real impact? Today on WIRED Politics Lab, we talk about a new online project that will be tracking the use of AI in elections around the world. Plus, Nilesh Christopher dives into the lucrative industry of deepfakes, and how politicians are using them to bombard Indian voters.
- Asia > India (0.40)
- Africa > South Africa (0.18)
- North America > United States > New Hampshire (0.06)
- Asia > Indonesia (0.06)
- Government > Voting & Elections (1.00)
- Government > Regional Government > Africa Government (0.33)
RACCooN: Remove, Add, and Change Video Content with Auto-Generated Narratives
Yoon, Jaehong, Yu, Shoubin, Bansal, Mohit
Recent video generative models primarily rely on carefully written text prompts for specific tasks, like inpainting or style editing. They require labor-intensive textual descriptions for input videos, hindering their flexibility to adapt personal/raw videos to user specifications. This paper proposes RACCooN, a versatile and user-friendly video-to-paragraph-to-video generative framework that supports multiple video editing capabilities such as removal, addition, and modification, through a unified pipeline. RACCooN consists of two principal stages: Video-to-Paragraph (V2P) and Paragraph-to-Video (P2V). In the V2P stage, we automatically describe video scenes in well-structured natural language, capturing both the holistic context and focused object details. Subsequently, in the P2V stage, users can optionally refine these descriptions to guide the video diffusion model, enabling various modifications to the input video, such as removing, changing subjects, and/or adding new objects. The proposed approach stands out from other methods through several significant contributions: (1) RACCooN suggests a multi-granular spatiotemporal pooling strategy to generate well-structured video descriptions, capturing both the broad context and object details without requiring complex human annotations, simplifying precise video content editing based on text for users. (2) Our video generative model incorporates auto-generated narratives or instructions to enhance the quality and accuracy of the generated content. It supports the addition of video objects, inpainting, and attribute modification within a unified framework, surpassing existing video editing and inpainting benchmarks. The proposed framework demonstrates impressive versatile capabilities in video-to-paragraph generation, video content editing, and can be incorporated into other SoTA video generative models for further enhancement.
- North America > United States > North Carolina (0.04)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
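The two-stage V2P/P2V flow described above can be sketched schematically. Both stages are placeholder functions I've invented for illustration: the real V2P stage is a multimodal captioner with multi-granular spatiotemporal pooling, and the real P2V stage is a video diffusion model.

```python
# Schematic sketch of a video-to-paragraph-to-video loop in the style of
# RACCooN. Both stages are hypothetical placeholders, not the actual models.

def video_to_paragraph(video_frames):
    """V2P stage: produce a structured description (here, a trivial summary)."""
    return f"A clip of {len(video_frames)} frames showing a red ball on grass."

def edit_description(description, find, replace):
    """Optional user refinement step: modify the auto-generated narrative."""
    return description.replace(find, replace)

def paragraph_to_video(description, num_frames=4):
    """P2V stage: placeholder 'generation' that tags frames with the prompt."""
    return [f"frame {i}: {description}" for i in range(num_frames)]

frames = ["f0", "f1", "f2", "f3"]
desc = video_to_paragraph(frames)
desc = edit_description(desc, "red ball", "blue ball")  # e.g. an attribute change
out = paragraph_to_video(desc)
```

The structural point is that all edits (removal, addition, modification) pass through the intermediate paragraph, so the user edits text rather than pixels.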
Improving Text-to-Image Consistency via Automatic Prompt Optimization
Mañas, Oscar, Astolfi, Pietro, Hall, Melissa, Ross, Candace, Urbanek, Jack, Williams, Adina, Agrawal, Aishwarya, Romero-Soriano, Adriana, Drozdzal, Michal
Impressive advances in text-to-image (T2I) generative models have yielded a plethora of high performing models which are able to generate aesthetically appealing, photorealistic images. Despite the progress, these models still struggle to produce images that are consistent with the input prompt, oftentimes failing to capture object quantities, relations and attributes properly. Existing solutions to improve prompt-image consistency suffer from the following challenges: (1) they oftentimes require model fine-tuning, (2) they only focus on nearby prompt samples, and (3) they are affected by unfavorable trade-offs among image quality, representation diversity, and prompt-image consistency. In this paper, we address these challenges and introduce a T2I optimization-by-prompting framework, OPT2I, which leverages a large language model (LLM) to improve prompt-image consistency in T2I models. Our framework starts from a user prompt and iteratively generates revised prompts with the goal of maximizing a consistency score. Our extensive validation on two datasets, MSCOCO and PartiPrompts, shows that OPT2I can boost the initial consistency score by up to 24.9% in terms of DSG score while preserving the FID and increasing the recall between generated and real data. Our work paves the way toward building more reliable and robust T2I systems by harnessing the power of LLMs.
- North America > Canada > Quebec > Montreal (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
- North America > Montserrat (0.04)
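The optimization-by-prompting loop described in the abstract can be mimicked with a toy search. Both the scorer (a word count standing in for a DSG-style consistency metric) and the rewriter (canned variants standing in for LLM-proposed revisions) are illustrative assumptions, not OPT2I's components.

```python
# Toy sketch of optimization-by-prompting in the spirit of OPT2I:
# iteratively propose prompt rewrites and keep the best-scoring one.
# The scorer and rewriter below are illustrative stand-ins.

def consistency_score(prompt):
    """Stand-in for a prompt-image consistency metric: reward explicitness."""
    return len(prompt.split())

def propose_revisions(prompt):
    """Stand-in for the LLM rewriter: emit a few more detailed variants."""
    return [prompt + " on a wooden table",
            prompt + " in soft morning light, photorealistic"]

def optimize_prompt(prompt, iterations=3):
    """Hill-climb on the consistency score, starting from the user prompt."""
    best, best_score = prompt, consistency_score(prompt)
    for _ in range(iterations):
        for candidate in propose_revisions(best):
            s = consistency_score(candidate)
            if s > best_score:
                best, best_score = candidate, s
    return best, best_score

best, score = optimize_prompt("two cats")
```

No model fine-tuning happens anywhere in the loop, which is the first of the three challenges the paper sets out to avoid.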
Understanding Object Detection
Imagine Google Photos: for all of the pictures you have, how do you label them by object? Do you want to tag them one by one? And what about self-driving cars: how do they detect pedestrians, cars, traffic lights, and impending obstacles? In recent years, image classification has gained huge traction, especially with CNNs and disruptive applications (e.g., self-driving cars).
- Information Technology (1.00)
- Transportation > Ground > Road (0.90)
R-CNN object detection with Keras, TensorFlow, and Deep Learning - PyImageSearch
In this tutorial, you will learn how to build an R-CNN object detector using Keras, TensorFlow, and Deep Learning. Today's tutorial is the final part in our 4-part series on deep learning and object detection: What if we wanted to train an object detection network on our own custom datasets? How can we train that network using Selective Search? And how will using Selective Search change our object detection inference script? In fact, these are the same questions that Girshick et al. had to consider in their seminal deep learning object detection paper Rich feature hierarchies for accurate object detection and semantic segmentation. Each of these questions will be answered in today's tutorial -- and by the time you're done reading it, you'll have a fully functioning R-CNN, similar (yet simplified) to the one Girshick et al. implemented! To learn how to build an R-CNN object detector using Keras and TensorFlow, just keep reading.
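One step at the heart of training an R-CNN on a custom dataset is turning Selective Search proposals into labeled training examples by their IoU overlap with ground-truth boxes. The sketch below shows that labeling step; the 0.7/0.3 thresholds are common illustrative choices, not fixed rules from the tutorial.

```python
# Sketch of the IoU-based labeling step used when converting Selective Search
# proposals into R-CNN training examples. Boxes are (x1, y1, x2, y2);
# the 0.7 / 0.3 thresholds are illustrative defaults.

def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def label_proposal(proposal, ground_truth, pos_thresh=0.7, neg_thresh=0.3):
    """Positive (1) if the proposal overlaps ground truth enough, negative (0)
    if it barely overlaps, ignored (None) in the ambiguous middle range."""
    score = iou(proposal, ground_truth)
    if score >= pos_thresh:
        return 1
    if score <= neg_thresh:
        return 0
    return None

gt = (10, 10, 50, 50)
pos = label_proposal((12, 12, 52, 52), gt)  # high overlap -> positive
neg = label_proposal((60, 60, 90, 90), gt)  # no overlap   -> negative
```

Crops labeled this way become the positive/negative image patches fed to the CNN classifier, which is what lets the tutorial's pipeline train on arbitrary custom datasets.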
What is AI, ML, neural networks, deep learning and random forests
Artificial intelligence, or AI: the phrase gets bandied about an awful lot. Unfortunately, a lot of companies that say they use AI don't. It's a marketing gimmick -- maybe they are even targeting shareholders, trying to push up their share price. In fact, many experts in AI don't even like the term; they much prefer the words machine learning, or ML. Delve deeper, and you come across many more terms: neural networks, deep learning, natural language processing and random forests.
Why the future of AI reminds me of the movie 'Arrival'
Just a few weeks ago, I read that the three top British AI companies to watch are Alphabet subsidiary DeepMind, cyber security firm DarkTrace and Blippar. Not sure why Blippar was on the list; it was, after all, an augmented reality company. But what we can now say is that Blippar is in trouble -- it may not have gone to meet its maker, but it has met its administrators. I can see an argument that follows on from this which suggests that the future of AI is worrisome. After all, there are clear parallels with boo.com.
The unfiltered joy of Christine McConnell's 'Mortal Kombat' cake
Rose is an obese, Frankenstein raccoon with a pink bow on top of her ratty head and a bent fork where her left hand should be. She's died at least twice, and each time, she's been lovingly brought back to life by her creator, Christine McConnell. Rose is one of the fantastic puppet friends in The Curious Creations of Christine McConnell, a Netflix series whose debut season helped define Halloween 2018. It stars McConnell, an endlessly creative baker whose online fame has exploded over the past five years, and a cast of puppet creatures produced by The Jim Henson Company. There's Edgar the bumbling werewolf, Rankle the resurrected, mummified cat god, and, of course, Rose the taxidermied raccoon.
- Media > Film (1.00)
- Leisure & Entertainment (1.00)
- Media > Television (0.92)