Shriram, Jaidev
RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion
Shriram, Jaidev, Trevithick, Alex, Liu, Lingjie, Ramamoorthi, Ravi
We introduce RealmDreamer, a technique for generating general forward-facing 3D scenes from text descriptions. Our technique optimizes a 3D Gaussian Splatting representation to match complex text prompts. We initialize these splats with state-of-the-art text-to-image generators, lifting their samples into 3D and computing the occlusion volume. We then optimize this representation across multiple views as a 3D inpainting task with image-conditional diffusion models. To learn correct geometric structure, we incorporate a depth diffusion model conditioned on samples from the inpainting model, providing rich geometric structure. Finally, we finetune the model using sharpened samples from image generators. Notably, our technique requires no video or multi-view data and can synthesize a variety of high-quality 3D scenes in different styles, consisting of multiple objects. Its generality additionally allows 3D synthesis from a single image.
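The "lifting samples into 3D and computing the occlusion volume" step can be illustrated with a minimal sketch: back-project a depth map into a point cloud with a pinhole camera model, then mark the region strictly behind the observed surface as the occlusion volume. All function names and parameters here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def lift_depth_to_points(depth, fx=1.0, fy=1.0):
    """Back-project a depth map into camera-space 3D points (pinhole model)."""
    h, w = depth.shape
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

def occlusion_volume(depth, z_samples):
    """For each pixel ray, mark depth samples strictly behind the surface
    as occluded; returns a (num_z, H, W) boolean mask."""
    return z_samples[:, None, None] > depth[None, :, :]

# Toy example: a flat surface at depth 2.0 on a 4x4 image.
depth = np.full((4, 4), 2.0)
pts = lift_depth_to_points(depth)          # (16, 3) point cloud
zs = np.linspace(0.5, 3.5, 7)              # candidate depths along each ray
occ = occlusion_volume(depth, zs)          # slices beyond z=2.0 are occluded
```

In this toy setup, the occlusion volume is simply everything behind the visible surface; a scene-generation pipeline would initialize new content (e.g. splats to be inpainted) inside that region.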
Queer In AI: A Case Study in Community-Led Participatory AI
QueerInAI, Organizers Of, Ovalle, Anaelia, Subramonian, Arjun, Singh, Ashwin, Voelcker, Claas, Sutherland, Danica J., Locatelli, Davide, Breznik, Eva, Klubička, Filip, Yuan, Hang, J, Hetvi, Zhang, Huan, Shriram, Jaidev, Lehman, Kruno, Soldaini, Luca, Sap, Maarten, Deisenroth, Marc Peter, Pacheco, Maria Leonor, Ryskina, Maria, Mundt, Martin, Agarwal, Milind, McLean, Nyx, Xu, Pan, Pranav, A, Korpan, Raj, Ray, Ruchira, Mathew, Sarah, Arora, Sarthak, John, ST, Anand, Tanvi, Agrawal, Vishakha, Agnew, William, Long, Yanan, Wang, Zijie J., Talat, Zeerak, Ghosh, Avijit, Dennler, Nathaniel, Noseworthy, Michael, Jha, Sharvani, Baylor, Emi, Joshi, Aditya, Bilenko, Natalia Y., McNamara, Andrew, Gontijo-Lopes, Raphael, Markham, Alex, Dǎng, Evyn, Kay, Jackie, Saraswat, Manu, Vytla, Nikhil, Stark, Luke
We present Queer in AI as a case study for community-led participatory design in AI. We examine how participatory design and intersectional tenets started and shaped this community's programs over the years. We discuss different challenges that emerged in the process, look at ways this organization has fallen short of operationalizing participatory and intersectional principles, and then assess the organization's impact. Queer in AI provides important lessons and insights for practitioners and theorists of participatory methods broadly through its rejection of hierarchy in favor of decentralization, its success at building aid and programs by and for the queer community, and its effort to change actors and institutions outside of the queer community. Finally, we theorize how communities like Queer in AI contribute to participatory design in AI more broadly by fostering cultures of participation in AI, welcoming and empowering marginalized participants, critiquing poor or exploitative participatory practices, and bringing participation to institutions outside of individual research projects. Queer in AI's work serves as a case study of grassroots activism and participatory methods within AI, demonstrating the potential of community-led participatory methods and intersectional praxis, while also providing challenges, case studies, and nuanced insights to researchers developing and using participatory methods.
Sonus Texere! Automated Dense Soundtrack Construction for Books using Movie Adaptations
Shriram, Jaidev, Tapaswi, Makarand, Alluri, Vinoo
Reading, much like music listening, is an immersive experience that transports readers while taking them on an emotional journey. Listening to complementary music can amplify the reading experience, especially when the music is stylistically cohesive and emotionally relevant. In this paper, we propose the first fully automatic method for building a dense soundtrack for books, which plays high-quality instrumental music for the entire reading duration. Our work employs a unique text-processing and music-weaving pipeline that determines the context and emotional composition of scenes in a chapter. This allows our method to identify and play relevant excerpts from the soundtrack of the book's movie adaptation. By relying on the movie composer's craftsmanship, our book soundtracks include expert-made motifs and other scene-specific musical characteristics. We validate the design decisions of our approach through a perceptual study. Our readers note that the book soundtrack greatly enhanced their reading experience, owing to the high immersion afforded by uninterrupted, style-consistent music and the heightened emotional state attained through precise emotion and scene-context recognition.
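The matching step of such a pipeline can be sketched as follows: given per-scene emotion vectors extracted from a chapter and per-excerpt emotion vectors from the movie soundtrack, choose the most similar excerpt for each scene. The feature choice (valence/arousal), function names, and cosine-similarity matching here are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def cosine_sim(a, b):
    """Row-wise cosine similarity between two sets of vectors."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

def weave_soundtrack(scene_emotions, excerpt_emotions):
    """For each scene, return the index of the best-matching excerpt."""
    sims = cosine_sim(scene_emotions, excerpt_emotions)
    return sims.argmax(axis=1)

# Toy example: 3 scenes and 2 excerpts, features = (valence, arousal).
scenes = np.array([[0.9, 0.2],   # calm, positive scene
                   [0.1, 0.8],   # tense, high-arousal scene
                   [0.8, 0.3]])  # another positive scene
excerpts = np.array([[1.0, 0.1],  # gentle theme
                     [0.0, 1.0]]) # dramatic theme
playlist = weave_soundtrack(scenes, excerpts)  # excerpt index per scene
```

A full system would additionally segment chapters into scenes, extract emotions from text, and crossfade between the chosen excerpts; this sketch only shows the scene-to-excerpt assignment.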