Collaborating Authors

 playback speed


AIx Speed: Playback Speed Optimization Using Listening Comprehension of Speech Recognition Models

arXiv.org Artificial Intelligence

Since humans can listen to audio and watch videos at faster speeds than actually observed, we often listen to or watch these pieces of content at higher playback speeds to increase the time efficiency of content comprehension. To further utilize this capability, systems that automatically adjust the playback speed according to the user's condition and the type of content to assist in more efficient comprehension of time-series content have been developed. However, there is still room for these systems to further extend human speed-listening ability by generating speech with playback speed optimized for even finer time units and providing it to humans. In this study, we determine whether humans can hear the optimized speech and propose a system that automatically adjusts playback speed at units as small as phonemes while ensuring speech intelligibility. The system uses the speech recognizer score as a proxy for how well a human can hear a certain unit of speech and maximizes the speech playback speed to the extent that a human can hear. This method can be used to produce fast but intelligible speech. In the evaluation experiment, we compared speech played back at a constant fast speed with the flexibly sped-up speech generated by the proposed method in a blind test and confirmed that the proposed method produced speech that was easier to listen to.
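The idea in the abstract above, choosing the fastest speed per unit of speech at which a recognizer score stays acceptable, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function `choose_speeds`, the candidate speed grid, the 0.8 threshold, and the `toy_score` stand-in for a real speech recognizer score are all assumptions made for the example.

```python
from typing import Callable, List, Sequence


def choose_speeds(
    segments: Sequence[str],
    score: Callable[[str, float], float],
    candidate_speeds: Sequence[float] = (1.0, 1.5, 2.0, 2.5, 3.0),
    threshold: float = 0.8,
) -> List[float]:
    """Pick, per segment, the fastest candidate speed whose score meets the threshold.

    `score(segment, speed)` is a hypothetical proxy for intelligibility,
    standing in for a speech recognizer's confidence on the sped-up unit.
    """
    speeds = []
    for seg in segments:
        best = min(candidate_speeds)  # fall back to the slowest speed if nothing passes
        for s in sorted(candidate_speeds):
            if score(seg, s) >= threshold:
                best = s  # keep the largest passing speed seen so far
        speeds.append(best)
    return speeds


def toy_score(segment: str, speed: float) -> float:
    # Hypothetical scorer: longer units tolerate more speed-up. Not a real recognizer.
    return min(1.0, len(segment) / (3.0 * speed))


if __name__ == "__main__":
    print(choose_speeds(["a", "hello", "recognition"], toy_score))
```

In a real system the scorer would run a recognizer on time-stretched audio for each unit; the string segments here only keep the sketch self-contained.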


How I Taught My Computer to Write Its Own Music - Issue 79: Catalysts

Nautilus

On a warm day in April 2013, I was sitting in a friend's kitchen in Paris, trying to engineer serendipity. I was trying to get my computer to write music on its own. I wanted to be able to turn it on and have it spit out not just any goofy little algorithmic tune but beautiful, compelling, mysterious music; something I'd be proud to have written myself. The kitchen window was open, and as I listened to the sounds of children playing in the courtyard below, I thought about how the melodies of their voices made serendipitous counterpoint with the songs of nearby birds and the intermittent drone of traffic on the rue d'Alésia. In response to these daydreams, I was making a few tweaks to my software--a chaotic, seat-of-the-pants affair that betrayed my intuitive, self-taught approach to programming--when I saw that Bill Seaman had just uploaded a new batch of audio files to our shared Dropbox folder. I had been collaborating with Bill, a media artist, on various aspects of computational creativity over the past few years. I loaded Bill's folder of sound files along with some of my own into the software and set it rolling. I was thrilled and astonished.


Using an AI creativity system to explore how aesthetic experiences are processed along the brain's perceptual neural pathways

arXiv.org Artificial Intelligence

With the increased sophistication of AI techniques, the application of these systems has been expanding to ever newer fields. Increasingly, these systems are being used in modeling of human aesthetics and creativity, e.g., how humans create artworks and design products. Our lab has developed one such AI creativity deep learning system that can be used to create artworks in the form of images and videos. In this paper, we describe this system and its use in studying the human visual system and the formation of aesthetic experiences. Specifically, we show how time-based AI-created media can be used to explore the nature of the dual-pathway neuro-architecture of the human visual system and how this relates to higher cognitive judgments such as aesthetic experiences that rely on these divergent information streams. We propose a theoretical framework for how the movement within percepts such as video clips causes the engagement of reflexive attention and a subsequent focus on visual information that is primarily processed via the dorsal stream, thereby modulating aesthetic experiences that rely on information relayed via the ventral stream. We outline our recent study in support of our proposed framework, which serves as the first study that investigates the relationship between the two visual streams and aesthetic experiences.


10 digital tricks you'll wish you knew before now

FOX News

Remember shortcuts you could enact with a computer keyboard? Using a particular combination of CNTL, ALT, and FN, you could quickly scroll through documents, open and close programs, and move elements around. They still exist, of course. Here's a whole bunch of Facebook keyboard shortcuts you never knew existed. Nowadays, with so many tricks, niches, and shortcuts, those special operations look downright quaint.


10 tech tips for Google, Instagram and Facebook to get your digital life in order

USATODAY - Tech Top Stories

Remember shortcuts you could enact with a computer keyboard? Using a particular combination of CNTL, ALT and FN, you could quickly scroll through documents, open and close programs, and move elements around. They still exist, of course. But nowadays, with so many tricks, niches and shortcuts, those special operations look downright quaint. Many operations are well known, and most of them are tiny and frivolous.


How I Taught My Computer to Write Its Own Music - Issue 50: Emergence

Nautilus

On a warm day in April 2013, I was sitting in a friend's kitchen in Paris, trying to engineer serendipity. I was trying to get my computer to write music on its own. I wanted to be able to turn it on and have it spit out not just any goofy little algorithmic tune but beautiful, compelling, mysterious music; something I'd be proud to have written myself. The kitchen window was open, and as I listened to the sounds of children playing in the courtyard below, I thought about how the melodies of their voices made serendipitous counterpoint with the songs of nearby birds and the intermittent drone of traffic on the rue d'Alésia. In response to these daydreams, I was making a few tweaks to my software--a chaotic, seat-of-the-pants affair that betrayed my intuitive, self-taught approach to programming--when I saw that Bill Seaman had just uploaded a new batch of audio files to our shared Dropbox folder. I had been collaborating with Bill, a media artist, on various aspects of computational creativity over the past few years. I loaded Bill's folder of sound files along with some of my own into the software and set it rolling. I was thrilled and astonished.